Abstract

Kingman derived the Ewens sampling formula for random partitions describing the genetic 
variation in a neutral mutation model defined by a Poisson process of mutations along lines of 
descent governed by a simple coalescent process, and observed that similar methods could be 
applied to more complex models. Möhle described the recursion which determines the generalization
of the Ewens sampling formula in the situation when the lines of descent are governed by
a $\Lambda$-coalescent, which allows multiple mergers. Here we show that the basic integral representation
of transition rates for the $\Lambda$-coalescent is forced by sampling consistency under more general
assumptions on the coalescent process. Exploiting an analogy with the theory of regenerative 
partition structures, we provide various characterizations of the associated partition structures 
in terms of discrete-time Markov chains. 

1 Introduction 

The theory of random coalescent processes starts from Kingman's series of papers [20, 21, 22] in
1982. The idea comes from biological studies of the genealogy of a haploid model: given a large
population with many generations, you track backward in time the family history of each individual
in the current generation. As you track further, the family lines coalesce with each other, eventually
all terminating at a common ancestor of the current generation. The same mathematical process may
be interpreted in another way, as describing collisions of an aggregating system of physical particles.
In Kingman's coalescent process [20] each collision involves only two parts. This idea is extended
to coalescents with multiple collisions in [31, 33], where every collision can involve two or more parts.
This model is further developed into the theory of coalescents with simultaneous multiple collisions
in [...]. See [...] for related developments.

Kingman [22] indicated a basic connection between random partitions of natural interest in
genetics, and coalescent processes. Suppose in the haploid case the family lines of the current generation
are modeled by Kingman's coalescent, and mutations are applied along the family lines by
a Poisson process with rate $\theta/2$ for some non-negative number $\theta$. Define a partition by saying
that two individuals are in the same block if there is no mutation along their family lines before
they coalesce. Then the resulting random partition is governed by the Ewens sampling formula
with parameter $\theta$. See [28, Section 5.1, Exercise 2] and [...] for a review and more on this idea.
Recently, Möhle [23] applied this idea to genealogical trees modeled by coalescents with multiple
collisions and simultaneous multiple collisions. He studied the resulting family of partitions, and
derived a recursion which determines them. In [24], Möhle showed that the partition derived from
a coalescent with multiple collisions is regenerative in the sense of [...] if and only if the underlying
coalescent is Kingman's coalescent or a hook case, corresponding to the extreme cases when the
characteristic measure $\Lambda$ of the coalescent with multiple collisions concentrates at 0 or 1, respectively.
In particular, the intersection of Möhle's family of partitions with Pitman's two-parameter family
is the one-parameter Ewens family.

Here we offer a different approach to the family of random partitions generated by Poisson
marking along the lines of descent of a $\Lambda$-coalescent. We study partitions with an additional feature,
assigning each part one of two possible states: active or frozen. We introduce a new class of
continuous-time partition-valued coalescent processes, called coalescents with freeze, which are characterized
by an underlying measure determining collision rates, together with a freezing rate. Every
coalescent with freeze has a terminal state with all blocks frozen, called the final partition of this
process, whose distribution is characterized by the recursion of Möhle [23]. In the spirit of [14, 15],
we focus here on the discrete-time chains embedded in the coalescent with freeze, and from the
consistency of their transition operators we derive a backward recursion satisfied by the decrement
matrix, analogous to [14, Theorem 3.3]. This decrement matrix determines the partition through
Möhle's recursion. As in [14], we use algebraic methods to derive an integral representation for the
decrement matrix. Also, adapting an idea from [15], we establish a uniqueness result by constructing
another Markov chain, with state space the set of partitions of a finite set, whose unique stationary
distribution is the law of the final partition restricted to this set. We analyze in detail the case
of a coalescent with freeze when no simultaneous multiple collisions are permitted, leaving the more
general case to another paper.

The remainder of the paper is organized as follows. Some notation and background are
introduced in Section 2, together with a review of Möhle's result. In Section 3 the coalescent with
freeze is defined and the relation between our method and Möhle's method is discussed. In Section 4
we detail the study of the coalescent with freeze in terms of the freeze-and-merge (FM) operators of
the embedded finite discrete chain, whose consistency with sampling yields a backward recursion
for the decrement matrix. In Section 5 the Markov chain with sample-and-add (SA) operation is
introduced, and the law of the partition in our study is identified as the unique stationary distribution
of this chain. In Section 6 we derive the integral representation for an infinite decrement matrix.
This gives another approach to Möhle's partitions via consistent freeze-and-merge chains, which may
be seen as discrete-time jumping processes associated with the $\Lambda$-coalescent with freeze. Section 7
provides an alternate approach to the representation of an infinite decrement matrix in terms of a
positivity condition on a single sequence. Section 8 offers some results about the structure of the
random set of freezing times derived from a coalescent with freeze. Finally, in Section 9 we point out
some striking parallels with our previous work on regenerative partition structures, which guided
this study. Section 10 mentions briefly some further parallels with the theory of homogeneous and
self-similar Markovian fragmentation processes due to Bertoin.

2 Some notation and background 

Following the notation of [28], for any finite set $F$, a partition of $F$ into $\ell$ blocks, also called a finite
set partition, is an unordered collection of non-empty disjoint sets $\{A_1, \ldots, A_\ell\}$ whose union is $F$.
In particular we consider partitions of the set $[n] := \{1, 2, \ldots, n\}$ for $n \in \mathbb{N}$. We use $\mathcal{P}_{[n]}$ to denote
the set of all partitions of $[n]$. A composition of the positive integer $n$ is an ordered sequence of
positive integers $(n_1, n_2, \ldots, n_\ell)$ with $\sum_{i=1}^{\ell} n_i = n$, where $\ell \in \mathbb{N}$ is the number of parts. We use $\mathcal{C}_n$ to
denote the set of all compositions of $n$, and $\mathcal{P}_n$ to denote the set of non-increasing compositions of
$n$, also called partitions of $n$.

Let $\pi_n = \{A_1, A_2, \ldots, A_\ell\}$ denote a generic partition of $[n]$; we may write $\pi_n \vdash [n]$ to indicate
this fact. The shape function from partitions of the set $[n]$ to partitions of the positive integer $n$ is
defined by

$$\mathrm{shape}(\pi_n) = (|A_1|, |A_2|, \ldots, |A_\ell|)^{\downarrow}, \tag{1}$$

where $|A_i|$ is the size of block $A_i$, that is, the number of elements in the block, and "$\downarrow$"
means arranging the sequence of sizes in non-increasing order.
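
As a small illustration of the shape function (1), the following sketch maps a set partition to the non-increasing sequence of its block sizes. The function name `shape` and the representation of a partition as a list of Python sets are choices made for this example only.

```python
def shape(partition):
    """Block sizes of a set partition, arranged in non-increasing order."""
    return tuple(sorted((len(block) for block in partition), reverse=True))

# A partition of [5] into three blocks.
pi_5 = [{1, 4}, {2}, {3, 5}]
print(shape(pi_5))
```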

A random partition $\Pi_n$ of $[n]$ is a random variable taking values in $\mathcal{P}_{[n]}$. It is called exchangeable
if its distribution is invariant under the action on partitions of $[n]$ by the symmetric group of
permutations of $[n]$. Equivalently, the distribution of $\Pi_n$ is given by the formula

$$\mathbb{P}(\Pi_n = \{A_1, A_2, \ldots, A_\ell\}) = p_n(|A_1|, |A_2|, \ldots, |A_\ell|) \tag{2}$$

for some symmetric function $p_n$ of compositions of $n$. We call $p_n$ the exchangeable partition probability
function (EPPF) of $\Pi_n$.

An exchangeable random partition of $\mathbb{N}$ is a sequence of exchangeable set partitions $\Pi_\infty =
(\Pi_n)_{n=1}^{\infty}$ with $\Pi_n \vdash [n]$, subject to the consistency condition

$$\Pi_n|_m = \Pi_m \qquad (m < n), \tag{3}$$

where the restriction operator $|_m$ acts on $\mathcal{P}_{[n]}$, $n > m$, by deleting elements $m+1, m+2, \ldots, n$.
The distribution of such an exchangeable random partition of $\mathbb{N}$ is determined by the function $p$
defined on the set of all integer compositions $\mathcal{C}_\infty := \cup_{i=1}^{\infty} \mathcal{C}_i$, which coincides with the EPPF $p_n$ of
$\Pi_n$ when acting on $\mathcal{C}_n$. This function $p$ is called the infinite EPPF associated with $\Pi_\infty = (\Pi_n)_{n=1}^{\infty}$.
The consistency condition (3) translates into the following addition rule for the EPPF $p$: for each
positive integer $n$ and each composition $(n_1, n_2, \ldots, n_\ell)$ of $n$,

$$p(n_1, n_2, \ldots, n_\ell) = p(n_1, n_2, \ldots, n_\ell, 1) + \sum_{j=1}^{\ell} p(n_1, \ldots, n_j + 1, \ldots, n_\ell), \tag{4}$$

where $(n_1, \ldots, n_j + 1, \ldots, n_\ell)$ is formed from $(n_1, \ldots, n_\ell)$ by adding 1 to $n_j$. Conversely, if a non-negative
function $p$ on compositions satisfies (4) and the normalization condition $p(1) = 1$, then by
Kolmogorov's extension theorem there exists an exchangeable random partition $\Pi_\infty$ with EPPF $p$.
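
To make the addition rule concrete, the sketch below checks it on the Ewens family mentioned in the Introduction. It assumes the standard closed form of the Ewens EPPF, $p(n_1, \ldots, n_\ell) = \theta^{\ell} \prod_i (n_i - 1)! \,/\, \prod_{i=0}^{n-1} (\theta + i)$, which is not written out in the text above. The function names and the value $\theta = 3/2$ are illustrative.

```python
from fractions import Fraction
from math import factorial

def ewens_eppf(comp, theta):
    """Ewens EPPF (assumed standard form) evaluated at a composition."""
    n = sum(comp)
    num = Fraction(theta) ** len(comp)
    for part in comp:
        num *= factorial(part - 1)
    den = Fraction(1)
    for i in range(n):
        den *= Fraction(theta) + i
    return num / den

def addition_rule_rhs(p, comp):
    """Right side of the addition rule: append a singleton, or add 1 to one part."""
    total = p(comp + (1,))
    for j in range(len(comp)):
        total += p(comp[:j] + (comp[j] + 1,) + comp[j + 1:])
    return total

theta = Fraction(3, 2)

def p(comp):
    return ewens_eppf(comp, theta)

# Exact rational arithmetic, so equality can be tested directly.
for comp in [(1,), (2,), (1, 1), (2, 1), (3, 1, 1)]:
    assert p(comp) == addition_rule_rhs(p, comp)
```

Exact fractions are used so that the identity is verified without floating-point tolerance.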

Similar definitions apply to a finite sequence of consistent exchangeable random set partitions
$(\Pi_m)_{m=1}^{n}$ with $\Pi_m \vdash [m]$, where $n$ is some fixed positive integer. The finite EPPF $p$ of such a
sequence can be defined as the unique recursive extension of $p_n$ by the addition rule (4) to all
compositions $(n_1, n_2, \ldots, n_\ell)$ of $m < n$.

Let $\mathcal{P}_\infty$ be the set of all partitions of $\mathbb{N}$. We identify each $\pi_\infty \in \mathcal{P}_\infty$ with the sequence $(\pi_1, \pi_2, \ldots) \in
\mathcal{P}_{[1]} \times \mathcal{P}_{[2]} \times \cdots$, where $\pi_n = \pi_\infty|_n$ is the restriction of $\pi_\infty$ to $[n]$ obtained by deleting all elements bigger
than $n$. Give $\mathcal{P}_\infty$ the topology it inherits as a subset of $\mathcal{P}_{[1]} \times \mathcal{P}_{[2]} \times \cdots$ with the product of discrete
topologies, so the space $\mathcal{P}_\infty$ is compact and metrizable. Following [20, ...], call a $\mathcal{P}_\infty$-valued
stochastic process $(\Pi_\infty(t), t \ge 0)$ a coalescent if it has càdlàg paths and $\Pi_\infty(s)$ is a refinement of
$\Pi_\infty(t)$ for every $s < t$. For a non-negative finite measure $\Lambda$ on the Borel subsets of $[0,1]$, a $\Lambda$-coalescent
is a $\mathcal{P}_\infty$-valued Markov coalescent $(\Pi_\infty(t), t \ge 0)$ whose restriction $(\Pi_n(t), t \ge 0)$ to $[n]$
is for each $n$ a Markov chain such that when $\Pi_n(t)$ has $b$ blocks, each $k$-tuple of blocks of $\Pi_n(t)$
merges to form a single block at rate $\lambda_{b,k}$, where

$$\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx) \qquad (2 \le k \le b < \infty). \tag{5}$$

The measure $\Lambda$ which characterizes the coalescent is derived from the consistency requirement, that
is, for any positive integers $0 < m < n < \infty$ and $\pi_n \vdash [n]$, the restricted process $(\Pi_n(t)|_m, t \ge 0)$
given $\Pi_n(0) = \pi_n$ has the same law as $(\Pi_m(t), t \ge 0)$ given $\Pi_m(0) = \pi_n|_m$. This condition is fulfilled
if and only if the array of rates $(\lambda_{b,k})$ satisfies

$$\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1} \qquad (2 \le k \le b < \infty). \tag{6}$$

The integral representation (5) can be derived from (6) via de Finetti's theorem [..., Lemma 18].

When $\Lambda = \delta_0$, this reduces to Kingman's coalescent [20, 21, 22] with only binary merges. When
$\Lambda$ is the uniform distribution on $[0, 1]$, the coalescent is the Bolthausen-Sznitman coalescent. In
[36] this construction is further developed to build the $\Xi$-coalescent, where the measure $\Xi$ on the infinite
simplex characterizes the rates of simultaneous multiple collisions.
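
The two special cases just named give rates that can be written in closed form, which allows a direct check of the consistency relation between rates for $b$ and $b+1$ blocks. In the sketch below, the uniform case uses the Beta integral $\int_0^1 x^{k-2}(1-x)^{b-k}\,dx = (k-2)!\,(b-k)!/(b-1)!$, and the case $\Lambda = \delta_0$ keeps only the $k = 2$ term. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def lam_uniform(b, k):
    """lambda_{b,k} for Lambda uniform on [0,1] (Bolthausen-Sznitman case)."""
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))

def lam_kingman(b, k):
    """lambda_{b,k} for Lambda = delta_0: only binary merges have positive rate."""
    return Fraction(1) if k == 2 else Fraction(0)

def satisfies_consistency(lam, b_max):
    """Check lambda_{b,k} = lambda_{b+1,k} + lambda_{b+1,k+1} for 2 <= k <= b <= b_max."""
    return all(lam(b, k) == lam(b + 1, k) + lam(b + 1, k + 1)
               for b in range(2, b_max + 1)
               for k in range(2, b + 1))

print(satisfies_consistency(lam_uniform, 10))
print(satisfies_consistency(lam_kingman, 10))
```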

Möhle [23] studied the following generalization of Kingman's model [22]. Take a genetic sample
of $n$ individuals from a large population and label them as $\{1, 2, \ldots, n\}$. Suppose the ancestral lines
of these $n$ individuals evolve by the rules of a $\Lambda$-coalescent, and that given the genealogical tree,
whose branches are the ancestral lines of these individuals, mutations occur along the ancestral lines
according to a Poisson point process with rate $\rho > 0$. The infinitely-many-alleles model is assumed,
which means that when a gene mutates, a brand new type appears. Define a random partition of
$[n]$ by declaring individuals $i$ and $j$ to be in the same block if and only if they are of the same
type, that is either $i = j$ or there are no mutations along the ancestral lines of $i$ and $j$ before these
lines coalesce. These random partitions are exchangeable, and consistent as $n$ varies. The EPPF of
this random partition is the unique solution $p$ with $p(1) = 1$ of Möhle's recursion: for each positive
integer $n$ and each composition $(n_1, n_2, \ldots, n_\ell)$ of $n$,

$$p(n_1, n_2, \ldots, n_\ell) = \frac{q(n:1)}{n} \sum_{j:\, n_j = 1} p(\ldots, \widehat{n_j}, \ldots) + \sum_{k=2}^{n} q(n:k) \sum_{j:\, n_j \ge k} \frac{\binom{n_j}{k}}{\binom{n}{k}}\, p(\ldots, n_j - k + 1, \ldots), \tag{7}$$

where $(\ldots, \widehat{n_j}, \ldots)$ is formed from $(n_1, n_2, \ldots, n_\ell)$ by removing part $n_j$, $(\ldots, n_j - k + 1, \ldots)$ is formed
from $(n_1, n_2, \ldots, n_\ell)$ by only changing $n_j$ to $n_j - k + 1$, and $q(b:k)$ is the stochastic matrix

$$q(b:k) = \frac{\Phi(b:k)}{\Phi(b)} \qquad (1 \le k \le b \le n), \tag{8}$$

where

$$\Phi(b:1) = \rho b, \tag{9}$$

$$\Phi(b:k) = \binom{b}{k} \lambda_{b,k} = \binom{b}{k} \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx) \qquad (2 \le k \le b), \tag{10}$$

$$\Phi(b) = \sum_{k=1}^{b} \Phi(b:k) = \int_0^1 \frac{1 - (1-x)^b - bx(1-x)^{b-1}}{x^2}\,\Lambda(dx) + \rho b. \tag{11}$$

If at some time $t > 0$ there are exactly $b$ lines of descent whose associated genealogical trees of
depth $t$ contain no mutations, then $\Phi(b:1)$ is the total rate of mutations along one of these $b$ lines,
$\Phi(b:k)$ is the total rate of $k$-fold merges among these lines, and $\Phi(b)$ is the total rate of events of
either kind.
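
The passage from rates to the stochastic matrix $q$ can be sketched numerically as follows: each row normalizes the mutation rate $\rho b$ and the merge rates $\binom{b}{k}\lambda_{b,k}$ by their total. The helper `decrement_row` and the two rate functions are illustrative names, and $\rho > 0$ is assumed so that the total rate is never zero.

```python
from fractions import Fraction
from math import comb, factorial

def lam_uniform(b, k):
    """lambda_{b,k} for Lambda uniform on [0,1], via the Beta integral."""
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))

def lam_kingman(b, k):
    """lambda_{b,k} for Lambda = delta_0."""
    return Fraction(1) if k == 2 else Fraction(0)

def decrement_row(b, rho, lam):
    """Return (row, total): row[k] = Phi(b:k) / Phi(b), total = Phi(b). Assumes rho > 0."""
    phi = {1: Fraction(rho) * b}                 # total mutation rate
    for k in range(2, b + 1):
        phi[k] = comb(b, k) * lam(b, k)          # total rate of k-fold merges
    total = sum(phi.values())
    return {k: phi[k] / total for k in phi}, total

row, total = decrement_row(4, Fraction(1, 2), lam_kingman)
print(row[1], row[2], total)
```

For $\Lambda = \delta_0$ and $\rho = \theta/2$ the first entry simplifies to $q(b:1) = \theta/(b - 1 + \theta)$, which is the form used when comparing with the Ewens sampling formula.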

Möhle [23] derived the recursion (7) by conditioning on whether the first event met tracing back
in time from the current generation is a mutation or a collision. On the left side of (7), $p(n_1, n_2, \ldots, n_\ell)$
is the probability of ending up with any particular partition $\pi_n$ of the set $[n]$ into $\ell$ blocks of sizes
$(n_1, n_2, \ldots, n_\ell)$. On the right side, $q(n:1)$ is the chance that starting from the current generation,
one of the $n$ genes mutates before any collision; for this to happen together with the specified
partition of $[n]$, the individual with this gene must be chosen from among the singletons of
$\pi_n$, with chance $1/n$ for each different choice, and after that the restriction of the coalescent process
to a subset of $[n]$ of size $n - 1$ must end up generating the restriction of $\pi_n$ to that set. Similarly,
$q(n:k)$ is the chance that the first event met is $k$ out of $n$ genes coalescing into the same block.
Again, the $k$ individuals bearing these $k$ genes must be chosen from a block of $\pi_n$ of size $n_j \ge k$,
so the chance for possible choices from a block of size $n_j$ is $\binom{n_j}{k} / \binom{n}{k}$, and given exactly which
$k$ individuals are chosen, the restriction of the coalescent process to some set of $n - k + 1$ lines
of descent must end up generating a particular partition of these $n - k + 1$ lines into sets of sizes
$(\ldots, n_j - k + 1, \ldots)$. The multiplication of various probabilities is justified by the strong Markov
property of the $\Lambda$-coalescent at the time of the first event, and by the special symmetry property
that lines of descent representing blocks of individuals coalesce according to the same dynamics as
if they were singletons.
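
The conditioning argument above can be turned into a short recursive computation. The sketch below solves Möhle's recursion for a decrement matrix supplied as a function `q(n, k)`, then checks the Kingman case $\Lambda = \delta_0$, $\rho = \theta/2$ against the Ewens EPPF. The closed form of the Ewens EPPF and the simplification $q(n:1) = \theta/(n-1+\theta)$, $q(n:2) = (n-1)/(n-1+\theta)$ are assumptions of this example, as are all function names.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

def make_mohle_eppf(q):
    """Return p on compositions (tuples) solving the recursion, with p((1,)) = 1."""
    @lru_cache(maxsize=None)
    def p(comp):
        n = sum(comp)
        if n == 1:
            return Fraction(1)
        total = Fraction(0)
        for j, nj in enumerate(comp):
            if nj == 1:
                # first event: a mutation hits the singleton in position j
                total += q(n, 1) / n * p(comp[:j] + comp[j + 1:])
            for k in range(2, nj + 1):
                # first event: k lines merge, all taken from block j
                shrunk = comp[:j] + (nj - k + 1,) + comp[j + 1:]
                total += q(n, k) * Fraction(comb(nj, k), comb(n, k)) * p(shrunk)
        return total
    return p

def kingman_q(theta):
    """Decrement matrix for Lambda = delta_0 and rho = theta / 2."""
    def q(n, k):
        if k == 1:
            return Fraction(theta) / (n - 1 + theta)
        if k == 2:
            return Fraction(n - 1) / (n - 1 + theta)
        return Fraction(0)
    return q

def ewens_eppf(comp, theta):
    """Ewens EPPF (assumed standard form)."""
    num = Fraction(theta) ** len(comp)
    for part in comp:
        num *= factorial(part - 1)
    den = Fraction(1)
    for i in range(sum(comp)):
        den *= Fraction(theta) + i
    return num / den

def compositions(n):
    """All compositions of n as tuples."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

theta = Fraction(3, 2)
p = make_mohle_eppf(kingman_q(theta))
for n in range(1, 6):
    for comp in compositions(n):
        assert p(comp) == ewens_eppf(comp, theta)
```

Memoization keeps the number of evaluated compositions small, since the recursion revisits the same shorter compositions many times.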

In this paper we step back from these detailed dynamics of the $\Lambda$-coalescent with mutations
to consider the following questions related to Möhle's recursion (7) and associated partition-valued
processes. We choose to ignore the special form (8) of the matrix $(q(n:k);\ 1 \le k \le n < \infty)$ derived
from the pair $(\Lambda, \rho)$, and analyse Möhle's recursion (7) as an abstract relation between a stochastic matrix
$q$ and a function of compositions $p$. In particular, we ask the following questions:

1. For which probability distributions $q(n:k)$, $1 \le k \le n$, on $[n]$ is Möhle's recursion (7)
satisfied by the EPPF $p$ of some exchangeable random partition of $[n]$, and is this $p$ uniquely
determined?

2. How can such random partitions be characterized probabilistically?

3. Can such random partitions of $[n]$ be consistent as $n$ varies for any other $q$ besides $q$ derived
from $(\Lambda, \rho)$ as above?

We stress that in the first two questions the recursion (7) is only required to hold for a single value
of $n$, while in the third question (7) must hold for all $n = 1, 2, \ldots$. The answer to the first question
is that for each fixed probability distribution $q(n:k)$, $1 \le k \le n$, on $[n]$, Möhle's recursion (7)
determines a unique EPPF $p$ for an exchangeable random partition of $[n]$ (Theorem [...]). Answering
the second question, we characterize the distribution of this random partition in two different ways:
firstly as the terminal state of a discrete-time Markovian coalescent process, the freeze-and-merge
chain introduced in Section 4, and secondly as the stationary distribution of a partition-valued
Markov chain with quite a different transition mechanism, the sample-and-add chain introduced
in Section 5. The answer to the third question is positive if we restrict $n$ to some bounded range
of values, for some but not all $q$ (see Section [...]), but negative if we require consistency for all $n$
(Theorem 13): if an infinite EPPF $p$ solves Möhle's recursion (7) for all $n$ for some triangular matrix
$q$ with non-negative entries, then $q$ must have the form (8) for some $(\Lambda, \rho)$.

We were guided in this analysis by a remarkable parallel between this theory of finite and infinite
partitions subject to Möhle's recursion (7) and the theory of regenerative partitions developed in
[14]. Following the terminology in [14], we call a triangular stochastic matrix a decrement
matrix. We use the notation $q_n = (q(b:k);\ 1 \le k \le b \le n)$ or $q_\infty = (q(n:k);\ 1 \le k \le n < \infty)$
to indicate whether we wish to consider finite or infinite matrices. Thus, the entries of a decrement
matrix are non-negative and satisfy $\sum_{k=1}^{b} q(b:k) = 1$ for all $b$ in the required range. In the present
notation, the characteristic property of a regenerative partition is that its EPPF $p$ satisfies

$$p(n_1, n_2, \ldots, n_\ell) = \sum_{j=1}^{\ell} \binom{n}{n_j}^{-1} q(n:n_j)\, p(\ldots, \widehat{n_j}, \ldots) \tag{12}$$

for some decrement matrix $q = q_\infty$. The main results of [14, 15] gave similar answers to the above
questions for this recursion instead of Möhle's recursion (7).

There is an important distinction between the recursion (4) on the one hand and (7) and (12) on
the other hand. The recursion (4) has many solutions since it is a backward recursion, from larger
values of $n$ to smaller. By contrast, both (7) and (12) are forward recursions, from smaller values of
$n$ to larger. Consequently it is obvious that given an arbitrary infinite decrement matrix $q_\infty$, each
of the recursions (7) and (12) has a unique solution $p$ with the initial value $p(1) = 1$. Moreover, it
is clear that each of these functions $p$ can be written as a linear combination of products of entries
of the $q_\infty$ matrix.

To illustrate the close parallel between the two recursions (7) and (12), we list the first few values
of $p$ in terms of the decrement matrix $q$, first for Möhle's recursion (7):

$$\begin{aligned}
p(1) &= 1,\\
p(2) &= q(2:2),\\
p(1,1) &= q(2:1),\\
p(3) &= q(3:3) + q(3:2)q(2:2),\\
p(2,1) = p(1,2) &= \tfrac{1}{3} q(3:2)q(2:1) + \tfrac{1}{3} q(3:1)q(2:2),\\
p(1,1,1) &= q(3:1)q(2:1),\\
p(4) &= q(4:4) + q(4:3)q(2:2) + q(4:2)q(3:3) + q(4:2)q(3:2)q(2:2),\\
p(3,1) = p(1,3) &= \tfrac{1}{4} q(4:3)q(2:1) + \tfrac{1}{6} q(4:2)q(3:2)q(2:1) + \tfrac{1}{6} q(4:2)q(3:1)q(2:2)\\
&\quad + \tfrac{1}{4} q(4:1)q(3:3) + \tfrac{1}{4} q(4:1)q(3:2)q(2:2),\\
p(2,1,1) = p(1,2,1) = p(1,1,2) &= \tfrac{1}{6} q(4:2)q(3:1)q(2:1) + \tfrac{1}{6} q(4:1)q(3:2)q(2:1) + \tfrac{1}{6} q(4:1)q(3:1)q(2:2),\\
p(1,1,1,1) &= q(4:1)q(3:1)q(2:1).
\end{aligned}$$

Note that for a general transition matrix $q$ these functions $p$ may not be consistent as $n$ varies,
meaning that (4) may fail. A condition on $q_\infty$ equivalent to consistency of $p$ will be described later
in Lemma [...].
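
The possible failure of consistency can be seen already at $n = 2$. The sketch below evaluates the addition rule $p(2) = p(3) + p(2,1)$ using the expansions of $p(2)$, $p(3)$ and $p(2,1)$ in terms of $q$, where the coefficient $\tfrac{1}{3}$ in $p(2,1)$ is the one obtained by expanding Möhle's recursion. The function name and both test matrices are hypothetical choices for illustration.

```python
from fractions import Fraction

def addition_rule_defect(q2, q3):
    """Return p(2) - (p(3) + p(2,1)), with q2 = (q(2:1), q(2:2)), q3 = (q(3:1), q(3:2), q(3:3))."""
    q21, q22 = q2
    q31, q32, q33 = q3
    p2 = q22
    p3 = q33 + q32 * q22
    p21 = Fraction(1, 3) * q32 * q21 + Fraction(1, 3) * q31 * q22
    return p2 - (p3 + p21)

# Rows of the Kingman-type matrix q(b:1) = theta/(b-1+theta), q(b:2) = (b-1)/(b-1+theta), theta = 1.
kingman = addition_rule_defect((Fraction(1, 2), Fraction(1, 2)),
                               (Fraction(1, 3), Fraction(2, 3), Fraction(0)))

# An arbitrary decrement matrix: always freeze when b = 2, always merge all three when b = 3.
arbitrary = addition_rule_defect((Fraction(1), Fraction(0)),
                                 (Fraction(0), Fraction(0), Fraction(1)))
print(kingman, arbitrary)
```

A zero defect means the partitions of $[2]$ and $[3]$ are compatible under restriction at this composition; a nonzero defect shows that they are not.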

Similarly, the first few values of the $p$ determined by a decrement matrix $q$ via the recursion (12)
associated with a regenerative partition structure are:

$$\begin{aligned}
p(2) &= q(2:2),\\
p(1,1) &= q(2:1),\\
p(3) &= q(3:3),\\
p(2,1) = p(1,2) &= \tfrac{1}{3} q(3:2) + \tfrac{1}{3} q(3:1)q(2:2),\\
p(1,1,1) &= q(3:1)q(2:1),\\
p(4) &= q(4:4),\\
p(3,1) = p(1,3) &= \tfrac{1}{4} q(4:3) + \tfrac{1}{4} q(4:1)q(3:3),\\
p(2,1,1) = p(1,2,1) = p(1,1,2) &= \tfrac{1}{6} q(4:2)q(2:1) + \tfrac{1}{6} q(4:1)q(3:2) + \tfrac{1}{6} q(4:1)q(3:1)q(2:2),\\
p(1,1,1,1) &= q(4:1)q(3:1)q(2:1).
\end{aligned}$$

Looking at these displays, both similarities and differences may be observed. In particular, the
formulas for singleton partitions $(1, 1, \ldots, 1)$ are identical. As is to be expected, the simpler recursion
(12) for regenerative partitions generates simpler algebraic expressions than Möhle's recursion (7).
See [..., Equation (16)] (reproduced as (34) below) for the general formula for the shape function
associated with (12).
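
The regenerative values listed above can be reproduced mechanically. The sketch below assumes that the regenerative recursion has the form $p(n_1, \ldots, n_\ell) = \sum_j \binom{n}{n_j}^{-1} q(n:n_j)\, p(\ldots, \widehat{n_j}, \ldots)$; this form is an assumption of the example, chosen because it reproduces the listed expansions. The decrement matrix entries are arbitrary illustrative numbers.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

def make_regenerative_eppf(q):
    """Return p solving the (assumed) regenerative recursion for q given as q(n, k)."""
    @lru_cache(maxsize=None)
    def p(comp):
        if not comp:
            return Fraction(1)            # empty composition: nothing left to split off
        n = sum(comp)
        return sum(q(n, nj) / comb(n, nj) * p(comp[:j] + comp[j + 1:])
                   for j, nj in enumerate(comp))
    return p

# Hypothetical decrement matrix with rows summing to 1.
rows = {1: [Fraction(1)],
        2: [Fraction(1, 4), Fraction(3, 4)],
        3: [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)],
        4: [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10), Fraction(2, 5)]}

def q(n, k):
    return rows[n][k - 1]

p = make_regenerative_eppf(q)
lhs = p((2, 1, 1))
rhs = Fraction(1, 6) * (q(4, 2) * q(2, 1) + q(4, 1) * q(3, 2) + q(4, 1) * q(3, 1) * q(2, 2))
print(lhs == rhs)
```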

In principle, the recursions (7) and (12) have probabilistic meaning for an arbitrary decrement
matrix $q$, since they determine a sequence of exchangeable partitions of $[n]$ for $n$ in some finite
or the infinite range. Distributions of these partitions are obtained algebraically as above, by fully
expanding $p$ through $q$. However, typically these partitions of $[n]$ are not consistent with respect to
restrictions, so in the infinite case they might not determine the distribution of a partition of $\mathbb{N}$.



3 Coalescents with freeze 

To provide a natural generalization of partition structures derived from a coalescent with Poisson
mutations along the branches of a genealogical tree, we consider the structure of a partition of a set
(respectively, of an integer) with each of its blocks (or parts) assigned one of two possible conditions,
which we call active and frozen. We call such a combinatorial object a partially frozen partition of
a set or of an integer, as the case may be. Ignoring the conditions of the blocks of a partially frozen
partition $\pi^*$ induces an ordinary partition $\pi$. As special cases of partially frozen partitions, we
include the possibility that all blocks may be active, or all frozen. We use the symbol $\Sigma_n^*$ for the
pure singleton partition of $[n]$ with all blocks active, and $\Sigma_\infty^*$ for the sequence $(\Sigma_n^*)_{n=1}^{\infty}$. The $*$-shape
of a partially frozen partition $\pi_n^*$ of $[n]$ is the corresponding partially frozen partition of $n$, and the
ordinary shape is defined in terms of the induced partition $\pi_n$.

For each positive integer $n$, we denote by $\mathcal{P}^*_{[n]}$ the set of all partially frozen partitions of $[n]$. Let
$\mathcal{P}^*_\infty$ be the set of all partially frozen partitions of $\mathbb{N}$. We identify each element $\pi^*_\infty \in \mathcal{P}^*_\infty$ with the sequence
$(\pi_1^*, \pi_2^*, \ldots) \in \mathcal{P}^*_{[1]} \times \mathcal{P}^*_{[2]} \times \cdots$, where $\pi_n^*$ is $\pi^*_\infty|_n$, the restriction of $\pi^*_\infty$ to $[n]$. Endowing $\mathcal{P}^*_\infty$ with
the topology it inherits as a subset of $\mathcal{P}^*_{[1]} \times \mathcal{P}^*_{[2]} \times \cdots$, the space is compact and metrizable. We
call a random partially frozen partition of $[n]$ exchangeable if its distribution is invariant under the
action of permutations of $[n]$. Similarly to [...], call a $\mathcal{P}^*_\infty$-valued stochastic process $(\Pi^*_\infty(t), t \ge 0)$
a coalescent if it has càdlàg paths and $\Pi^*_\infty(s)$ is a $*$-refinement of $\Pi^*_\infty(t)$ for every $s < t$, meaning
that the induced partition $\Pi_\infty(s)$ is a refinement of $\Pi_\infty(t)$ and the set of frozen blocks of $\Pi^*_\infty(s)$ is
a subset of the set of frozen blocks of $\Pi^*_\infty(t)$.

The construction of an exchangeable random partition of $\mathbb{N}$ by cutting branches of the merger-history
tree of a $\Lambda$-coalescent $(\Pi_\infty(t), t \ge 0)$ by mutations with rate $\rho$ can now be formalized as
follows. For each $i \in \mathbb{N}$ let $T_i$ denote the random time at which a mutation first occurs along the
line of descent to leaf $i$ of the tree, and declare the block of $\Pi_\infty(t)$ containing $i$ to be active if $T_i > t$
and frozen if $T_i \le t$. This defines a $\mathcal{P}^*_\infty$-valued Markov process $(\Pi^*_\infty(t), t \ge 0)$. As $t \to \infty$ the state
$\Pi^*_\infty(t)$ approaches a limit $\Pi^*_\infty(\infty)$ with all blocks frozen. This is the exchangeable random partition
generated by the exchangeable sequence of random variables $(T_i, i \in \mathbb{N})$, meaning that two integers
$i$ and $j$ are in the same block of $\Pi^*_\infty(\infty)$ if and only if $T_i = T_j$. Assuming that $\Pi^*_\infty(0) = \Sigma^*_\infty$, it should be clear
that the EPPF of $\Pi^*_\infty(\infty)$ is that defined by Möhle's recursion (7). The following two theorems
present more formal statements.

Theorem 1. Let $(\lambda_{b,k},\ 2 \le k \le b < \infty)$ and $(\rho_{n,b},\ 1 \le b \le n < \infty)$ be two arrays of non-negative real
numbers. There exists for each $\pi^*_\infty \in \mathcal{P}^*_\infty$ a $\mathcal{P}^*_\infty$-valued coalescent $(\Pi^*_\infty(t), t \ge 0)$ with $\Pi^*_\infty(0) = \pi^*_\infty$,
whose restriction $(\Pi^*_n(t), t \ge 0)$ to $[n]$ is, for each $n$, a $\mathcal{P}^*_{[n]}$-valued Markov chain starting from $\pi^*_n =
\pi^*_\infty|_n$ and evolving with the rules:

• at each time $t \ge 0$, conditionally given $\Pi^*_n(t)$ with $b$ active blocks, each $k$-tuple of active blocks
of $\Pi^*_n(t)$ is merging to form a single active block at rate $\lambda_{b,k}$, and

• each active block turns into a frozen block at rate $\rho_{n,b}$,

if and only if the integral representation (5) holds for some non-negative finite measure $\Lambda$ on the
Borel subsets of $[0, 1]$, and $\rho_{n,b} = \rho$ for some non-negative real number $\rho$. This $\mathcal{P}^*_\infty$-valued process
$(\Pi^*_\infty(t), t \ge 0)$ directed by $(\Lambda, \rho)$ is a strong Markov process. For $\rho = 0$, this process reduces to the
$\Lambda$-coalescent, and for $\rho > 0$ the process is obtained by superposing Poisson marks at rate $\rho$ on the
merger-history tree of a $\Lambda$-coalescent, and freezing the block containing $i$ at the time of the first mark
along the line of descent of $i$ in the merger-history tree.

Proof. Just as in [2], consistency of the rate descriptions for different $n$ implies that (6) holds, hence
the integral representation (5); equality of the $\rho_{n,b}$'s is also obvious by consistency. □

Definition 2. Call this $\mathcal{P}^*_\infty$-valued Markov process directed by a non-negative real number $\rho$ and a
non-negative finite measure $\Lambda$ on $[0, 1]$ the $\Lambda$-coalescent freezing at rate $\rho$, or the $(\Lambda, \rho)$-coalescent
for short. Call a $(\Lambda, \rho)$-coalescent starting from state $\Sigma^*_\infty$ a standard $\Lambda$-coalescent freezing at rate $\rho$,
where $\Sigma^*_\infty$ is the pure singleton partition with all blocks active.

Consider the finite coalescent with freeze $(\Pi^*_n(t), t \ge 0)$ which is the restriction of a standard
$\Lambda$-coalescent freezing at rate $\rho$ to $[n]$. According to the description above, all active blocks
coalesce by the rules of a $\Lambda$-coalescent, except that every active block enters the frozen condition at
rate $\rho$, and after that the block stays frozen forever. Hence it is clear that as long as the freezing
rate $\rho$ is positive, in finite time the process $(\Pi^*_n(t), t \ge 0)$ will eventually reach a final partition $\mathcal{E}^*_n$,
with all of its blocks in the frozen condition.

Now recall Möhle's model [23] as reviewed in Section 2. The ancestral lines of $n$ labeled genes
of the current generation coalesce as a $\Lambda$-coalescent, and mutations happen along each ancestral line
as a Poisson point process with rate $\rho > 0$. Hence the final partition of $[n]$ is defined so that if the
ancestral line of an individual is interrupted by a mutation before the line coalesces with any other
ancestral lines, the individual will be a singleton in the partition. This corresponds to the idea
of freezing here: tracing the evolution of a particle starting from time 0, if a particle freezes before
coalescing with others, it will enter as a singleton block in the final partition of the process.

To detail the study, let us look at the discrete chain embedded in the $\Lambda$-coalescent freezing at rate
$\rho$. By definition, for each time $t \ge 0$, $\Pi^*_n(t)$ is a partially frozen exchangeable random partition
of $[n]$, hence its induced form $\Pi_n(t)$ gives an exchangeable random partition of $[n]$. So does the
final partition $\mathcal{E}^*_n = \Pi^*_n(\infty)$ and its induced form $\mathcal{E}_n$. Set $\mathcal{E}^*_\infty := (\mathcal{E}^*_n)$ as the final partition of
$(\Pi^*_\infty(t), t \ge 0)$, and denote its induced partition as $\mathcal{E}_\infty = (\mathcal{E}_n)$. The following facts can be read
from the existence of $(\Pi^*_\infty(t), t \ge 0)$ and Möhle's analysis recalled around (7).

Theorem 3. (Möhle [23, Theorem 3.1]) The induced final partition $\mathcal{E}_\infty = (\mathcal{E}_n)_{n=1}^{\infty}$ of a standard
$\Lambda$-coalescent freezing at rate $\rho > 0$ is an exchangeable infinite random partition of $\mathbb{N}$ whose EPPF $p$
is the unique solution of Möhle's recursion (7) with coefficients from the infinite decrement matrix
$q_\infty$ defined through $(\Lambda, \rho)$ as in (8).

4 Freeze-and-merge operations 

Given a stochastic process $X$ indexed by a continuous time parameter $t \ge 0$, assuming $X$ has right-continuous
piecewise constant paths, the jumping process derived from $X$ is the discrete-time process

$$\widehat{X} = (\widehat{X}(0), \widehat{X}(1), \ldots) = (X(T_0), X(T_1), X(T_2), \ldots)$$

where $T_0 := 0$ and $T_k$ for $k \ge 1$ is the least $t > T_{k-1}$ such that $X(t) \ne X(T_{k-1})$, if there is such a $t$,
and $T_k = T_{k-1}$ otherwise. The processes $X$ of interest here will ultimately arrive in some absorbing
state, and then so too will $\widehat{X}$. In particular, the finite coalescent with freeze $(\Pi^*_n(t), t \ge 0)$, obtained
by restriction to $[n]$ of a $\Lambda$-coalescent freezing at positive rate $\rho$, is a Markov chain with transition
rate $\binom{b}{k}\lambda_{b,k}$ for a $k$-merge and rate $b\rho$ for a freeze, where $b$ is the number of active blocks at time $t$
and the $\lambda_{b,k}$'s are as in (5); while the jumping process $\widehat{\Pi}^*_n$ is then a Markov chain governed by the
following freeze-and-merge operation $\mathrm{FM}_n$, which acts on a generic partially frozen partition $\pi^*_n$ of
$[n]$ as follows: if $\pi^*_n$ has $b > 1$ active blocks then

• with probability $q(b:k)$ some $k$ of the $b$ active blocks are chosen uniformly at random and merged
into a single active block (for $2 \le k \le b$),

• with probability $q(b:1)$ an active block is chosen uniformly at random from the $b$ blocks and
turned into a frozen block.

In the case $b = 1$ only the second option is possible, that is $q(1:1) = 1$, and when all blocks of $\pi^*_n$
are in the frozen condition, the operation is defined to be the identity. For the $\Lambda$-coalescent freezing at
positive rate $\rho$, we know that

• (i) the decrement matrix $q$ is of the special form (8), and

• (ii) the continuous-time processes $\Pi^*_n(t)$ are Markovian and consistent as $n$ varies, meaning
that $\Pi^*_m(t)$ for $m < n$ coincides with $\Pi^*_n(t)|_m$, the restriction of $\Pi^*_n(t)$ to $[m]$.

Note that $\mathrm{FM}_n$ always reduces the number of active blocks; in particular it transforms a partition of
$[n]$ with $b > 1$ active blocks into some other partition of $[n]$ with $b - 1$ active blocks with probability
$q(b:1) + q(b:2)$.

To view Möhle's recursion (7) in greater generality, we consider this freeze-and-merge operation
$\mathrm{FM}_n$ for $n$ some fixed positive integer, and $q_n$ a finite decrement matrix. Let $(\widehat{\Pi}^*_n(k),\ k = 0, 1, 2, \ldots)$
be the Markov chain obtained by iterating $\mathrm{FM}_n$ starting from $\widehat{\Pi}^*_n(0) = \Sigma^*_n$. Since $\mathrm{FM}_n$ is defined in
terms of $*$-shapes, each $\widehat{\Pi}^*_n(k)$ is a partially frozen exchangeable partition of $[n]$. The $\mathrm{FM}_n$-chain is
strictly transient, in the sense that it never visits the same state twice before it reaches a partially
frozen partition $\mathcal{E}^*_n$, all of whose blocks are frozen. Let $\mathcal{E}_n$ be the induced partition of $[n]$, which
we call the final partition, and regard $\mathcal{E}_n$ as the outcome of a random transformation of exchangeable
partitions $\Sigma_n \to \Sigma^*_n \to \mathcal{E}^*_n \to \mathcal{E}_n$.

Observe that for $m = 1, \ldots, n$ the first $m$ rows of the decrement matrix $q_n$ comprise a decrement
matrix $q_m$ which itself defines a freeze-and-merge operation $\mathrm{FM}_m$ on partially frozen partitions of
$[m]$. Hence for given $q_n$ we can also define a final partition $\mathcal{E}_m$ of the $\mathrm{FM}_m$-chain. Note that $\mathrm{FM}_n$
is essentially an operation on the set of active blocks, regardless of their contents, sizes, and the
configuration of frozen blocks.
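
The freeze-and-merge chain described above is small enough for $n = 3$ that the law of its final partition can be computed exactly by following every path. The sketch below does this for a hypothetical decrement matrix stored as nested dictionaries, with `q[b][k]` standing for $q(b:k)$. The function name, the data layout and the numerical entries are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def final_partition_law(n, q):
    """Exact law of the induced final partition, starting from n active singletons."""
    law = {}

    def run(active, frozen, prob):
        b = len(active)
        if b == 0:                       # every block is frozen: record the outcome
            law[frozen] = law.get(frozen, Fraction(0)) + prob
            return
        if q[b][1]:                      # freeze one active block, chosen uniformly
            for block in active:
                run(active - {block}, frozen | {block}, prob * q[b][1] / b)
        for k in range(2, b + 1):        # merge k active blocks, chosen uniformly
            if q[b][k]:
                choices = list(combinations(active, k))
                for chosen in choices:
                    merged = frozenset().union(*chosen)
                    run((active - set(chosen)) | {merged}, frozen,
                        prob * q[b][k] / len(choices))

    singletons = frozenset(frozenset([i]) for i in range(1, n + 1))
    run(singletons, frozenset(), Fraction(1))
    return law

# Hypothetical decrement matrix q_3; each row sums to 1 and q(1:1) = 1.
q = {1: {1: Fraction(1)},
     2: {1: Fraction(1, 4), 2: Fraction(3, 4)},
     3: {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}}

law = final_partition_law(3, q)
target = frozenset([frozenset([1, 2]), frozenset([3])])
closed_form = Fraction(1, 3) * (q[3][2] * q[2][1] + q[3][1] * q[2][2])
print(law[target] == closed_form)
```

The comparison at the end matches the probability of the particular partition $\{\{1,2\},\{3\}\}$ against the expansion of $p(2,1)$ obtained from Möhle's recursion.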

Lemma 4. Given an arbitrary decrement matrix $q_n$, let $p$ be the function on $\cup_{m=1}^{n} \mathcal{C}_m$ whose
restriction to $\mathcal{C}_m$ is the EPPF of $\mathcal{E}_m$, the final partition generated by the $\mathrm{FM}_m$ chain, for $1 \le m \le n$.
Then $p$ satisfies Möhle's recursion (7) for each composition $(n_1, n_2, \ldots, n_\ell) \in \mathcal{C}_n$.

Proof. A particular realization of $\mathcal{E}_n$ with $\mathrm{shape}(\mathcal{E}_n) = (n_1, \ldots, n_\ell)$ occurs when either

• (a) some block $\{j\}$ of $\mathcal{E}_n$ appears as a frozen singleton in $\mathrm{FM}_n(\Sigma_n^*)$ and all other singletons
$\{i\} \neq \{j\}$ evolve to form a partition with shape $(\ldots, \widehat{1}, \ldots)$; or

• (b) the first iteration of $\mathrm{FM}_n$ merges some singletons $\{j_1\}, \ldots, \{j_k\}$ ($k > 1$) in a single active
block which enters completely one of the blocks of $\mathcal{E}_n$.

By the definition of $p$ and the last remark before the lemma, the probability of the event (a) is

\[ \frac{1}{n}\, q(n:1)\, p(\ldots, \widehat{1}, \ldots), \]

because after $\{j\}$ gets frozen the operation $\mathrm{FM}_n$ is reduced to $\mathrm{FM}_{n-1}$ acting on partially frozen
partitions of $[n] \setminus \{j\}$. Similarly, the probability of (b) is

\[ \frac{q(n:k)}{\binom{n}{k}}\, p(\ldots, n_j - k + 1, \ldots), \]

where $n_j$ is the size of the block of $\mathcal{E}_n$ that absorbs the merged block,
because after creation of the active block $\{j_1, \ldots, j_k\}$ the iterates of $\mathrm{FM}_n$ can be identified with those
of $\mathrm{FM}_{n-k+1}$ acting on partially frozen partitions of $[n] \setminus \{j_2, \ldots, j_k\}$. Summation over all possible
choices yields (7). □

In the general setting of Lemma 4 the sequence of exchangeable final partitions $(\mathcal{E}_m)_{m=1}^{n}$ need
not be consistent with respect to restrictions. We turn next to the constraints on $q$ imposed by the
following stronger consistency condition:

Definition 5. For a decrement matrix $q_n$ and $1 \le m < n$, call the transition operators $\mathrm{FM}_n$
and $\mathrm{FM}_m$ derived from $q_n$ consistent if whenever $\Pi_n^*$ is a Markov chain governed by $\mathrm{FM}_n$, the jump
process derived from the restriction of $\Pi_n^*$ to $[m]$ is a Markov chain governed by $\mathrm{FM}_m$. Call the
decrement matrix $q_n$ consistent if this condition holds for every $1 \le m < n$.

As the leading example, it is clear from consistency of the continuous time chains $(\Pi_n^*(t), t \ge 0)$
which represent a $(\Lambda, \rho)$-coalescent, that for every $n$ the corresponding decrement matrix $q_n$ is
consistent. The following lemma collects some general facts about consistency. The proofs are
elementary and left to the reader. Let $\mathrm{FM}_n(\pi_n^*)$ denote the random partition obtained by action of
$\mathrm{FM}_n$ on an initial partially frozen partition $\pi_n^*$ of $[n]$.

Lemma 6. Given a particular decrement matrix $q_n$:

(i) For fixed $1 \le m < n$ the transition operators $\mathrm{FM}_m$ and $\mathrm{FM}_n$ are consistent if and only if for
each partially frozen partition $\pi_n^*$ of $[n]$, there is the equality in distribution

\[ \mathrm{FM}_m(\pi_n^*|_m) \stackrel{d}{=} \mathrm{FM}_n(\pi_n^*)\,\|_m \]

where on the left side $\pi_n^*|_m$ is the restriction of $\pi_n^*$ to $[m]$, and on the right side the notation $\|_m$
means the restriction to $[m]$ conditional on the event $\mathrm{FM}_n(\pi_n^*)|_m \neq \pi_n^*|_m$ that $\mathrm{FM}_n$ freezes or merges
at least one of the blocks of $\pi_n^*$ containing some element of $[m]$.

(ii) If $\mathrm{FM}_{m-1}$ and $\mathrm{FM}_m$ are consistent for every $1 < m \le n$, then so are $\mathrm{FM}_m$ and $\mathrm{FM}_n$ for
every $1 \le m < n$; that is, $q_n$ is consistent.

Lemma 7. A decrement matrix $q_n$ is consistent if and only if it satisfies the backward recursion

\[ q(b:k) = \frac{k+1}{b+1}\, q(b+1:k+1) + \frac{b+1-k}{b+1}\, q(b+1:k)
   + \frac{1}{b+1}\, q(b+1:1)\, q(b:k) + \frac{2}{b+1}\, q(b+1:2)\, q(b:k) \qquad (2 \le k \le b < n), \tag{13} \]

\[ q(b:1) = \frac{b}{b+1}\, q(b+1:1) + \frac{1}{b+1}\, q(b+1:1)\, q(b:1) + \frac{2}{b+1}\, q(b+1:2)\, q(b:1) \qquad (1 \le b < n). \tag{14} \]

Consequently, each probability distribution $q(n:\cdot)$ on $[n]$ determines a unique consistent decrement
matrix $q_n$ with this $n$th row.
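
The last statement of Lemma 7 can be made concrete. The following is a minimal sketch, not part of the original argument, which fills in the rows $q(b:\cdot)$ for $b = n-1, \ldots, 1$ by solving (13) and (14) for $q(b:k)$. The function name `consistent_decrement_matrix` and the use of exact rational arithmetic are choices made for this illustration only.

```python
from fractions import Fraction


def consistent_decrement_matrix(last_row):
    """Return {b: [q(b:1), ..., q(b:b)]} for b = 1..n, given the n-th row.

    last_row[k-1] plays the role of q(n:k). Rows are filled downward in b
    by solving the backward recursion (13), (14) for q(b:k).
    """
    n = len(last_row)
    q = {n: [Fraction(x) for x in last_row]}
    for b in range(n - 1, 0, -1):
        up = q[b + 1]  # up[k-1] = q(b+1:k)
        if b == 1:
            # The only probability distribution on [1]; treating this row
            # separately also avoids a zero denominator when q(2:1) = 0.
            q[1] = [Fraction(1)]
            continue
        # The two terms of (13)/(14) that contain q(b:k) are moved to the left.
        denom = 1 - (up[0] + 2 * up[1]) / (b + 1)
        row = [Fraction(b, b + 1) * up[0] / denom]  # recursion (14)
        for k in range(2, b + 1):  # recursion (13)
            num = (Fraction(k + 1, b + 1) * up[k]
                   + Fraction(b + 1 - k, b + 1) * up[k - 1])
            row.append(num / denom)
        q[b] = row
    return q
```

For $b \ge 2$ the denominator is at least $1 - 2/(b+1) > 0$, so the division is always defined.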

Proof. Consider $\mathrm{FM}_n$ and $\mathrm{FM}_{n-1}$ applied to $\Sigma_n^*$ and $\Sigma_{n-1}^*$, that is the partitions into singletons, all
in the active condition. For $2 \le k \le n-1$, $\mathrm{FM}_{n-1}$ operates by coalescing $\{1, \ldots, k\}$ into an active block
with probability

\[ \frac{q(n-1:k)}{\binom{n-1}{k}}. \tag{15} \]

As for the jump process of ($\mathrm{FM}_n$ restricted to $[n-1]$), the probability of a coalescence of $\{1, \ldots, k\}$
into an active block is the sum of the following four parts, depending on the development of the
$\mathrm{FM}_n$ chain. Let $T_1$ be the time of the first change in the restriction of the $\mathrm{FM}_n$ chain to $[n-1]$. To
obtain the required coalescence, either $T_1 = 1$ and the state after a single step of $\mathrm{FM}_n$ comes from
$\Sigma_n^*$ by coalescing $\{1, \ldots, k, n\}$ or $\{1, \ldots, k\}$, these occurring with probability

\[ \frac{q(n:k+1)}{\binom{n}{k+1}} + \frac{q(n:k)}{\binom{n}{k}}, \tag{16} \]

or $T_1 = 2$ with $\mathrm{FM}_n$ acting on $\Sigma_n^*$ by first freezing $\{n\}$ then coalescing $\{1, 2, \ldots, k\}$, or first coalescing
$\{n\}$ with one of the other $n-1$ singletons, leaving $1, 2, \ldots, k$ in $k$ distinct blocks, then coalescing these
$k$ blocks at the next step; these ways occur with probability

\[ \frac{q(n:1)}{n} \cdot \frac{q(n-1:k)}{\binom{n-1}{k}} + \frac{(n-1)\, q(n:2)}{\binom{n}{2}} \cdot \frac{q(n-1:k)}{\binom{n-1}{k}}. \tag{17} \]

Equate (15) with the sum of (16) and (17) to get (13) for $b = n-1$. In much the same way, $\mathrm{FM}_{n-1}$
may act on $\Sigma_{n-1}^*$ by freezing $\{1\}$ with probability

\[ \frac{q(n-1:1)}{n-1}. \tag{18} \]

For the jump process of ($\mathrm{FM}_n$ restricted to $[n-1]$), to get the required form, either $T_1 = 1$
and $\mathrm{FM}_n$ acts on $\Sigma_n^*$ by freezing $\{1\}$ with probability

\[ \frac{q(n:1)}{n}, \tag{19} \]

or $T_1 = 2$ and the result is obtained from $\Sigma_n^*$ by first freezing $\{n\}$ then freezing $\{1\}$, or first coalescing
$\{n\}$ with one of the other $n-1$ singletons then freezing the block containing 1, these ways occurring
with probability

\[ \frac{q(n:1)}{n} \cdot \frac{q(n-1:1)}{n-1} + \frac{(n-1)\, q(n:2)}{\binom{n}{2}} \cdot \frac{q(n-1:1)}{n-1}. \tag{20} \]

Equate (18) with the sum of (19) and (20) to get (14) for $b = n-1$. Combining the two gives the
recursion for $b = n-1$. The recursions for $b < n$ follow by replacing $n$ by $b+1$.

Conversely, granted the recursions (13) and (14), in order to prove consistency it is enough to
check the case $m = n-1$, and this is done by application of Lemma 6. □



Lemma 8. For $1 \le m \le n$ let $\mathcal{E}_m$ be the final partition of the $\mathrm{FM}_m$-chain starting in state $\Sigma_m^*$. If
the decrement matrix $q_n$ is consistent then the finite sequence of exchangeable random set partitions
$(\mathcal{E}_m)_{m=1}^{n}$ is consistent in the sense that

\[ \mathcal{E}_m \stackrel{d}{=} \mathcal{E}_n|_m. \]

The finite EPPF $p$ of $(\mathcal{E}_m)_{m=1}^{n}$ then satisfies Möhle's recursion (7) for all compositions of $m \le n$
in the left hand side.

Proof. The consistency in distribution is clear. To show (7) it is enough to look at the case with
compositions of $n$ on the left hand side, for which Lemma 4 applies. □



Here is our principal result regarding finite partitions satisfying (7):

Theorem 9. For a positive integer $n > 1$ and arbitrary probability distribution $q(n:\cdot)$ on $[n]$

(i) there exists a unique finite EPPF $p$ for a consistent sequence of random set partitions $(\Pi_m)_{m=1}^{n}$
which satisfies Möhle's recursion (7) for all compositions of $n$ on the left hand side,

(ii) this finite EPPF $p$ satisfies Möhle's recursion (7) for all compositions of positive integers $m < n$
on the left hand side with coefficients $q(m:\cdot)$ derived from $q(n:\cdot)$ by the recursion (13), (14),

(iii) for each $1 \le m \le n$ the distribution of $\Pi_m$ determined by the restriction of this EPPF $p$
to compositions of $m$ is that of the final partition of the $\mathrm{FM}_m$ Markov chain with decrement
matrix $q_m$ defined by (ii), starting from state $\Sigma_m^*$.

Proof. We apply Lemma 8. Given an arbitrary probability distribution $q(n:\cdot)$ on $[n]$, we can define all
$q(m:\cdot)$, $1 \le m < n$, by the backward recursion (13), (14). Then we use the decrement matrix $q_n$
with these rows to build a sequence of Markov chains: for each $m$, the chain $(\Pi_m^*(k), k = 0, 1, 2, \ldots)$
starts from $\Sigma_m^*$ and evolves according to $\mathrm{FM}_m$. The sequence of induced final partitions $(\mathcal{E}_m)_{m=1}^{n}$
of these chains has EPPF $p$ which satisfies recursion (7). Hence the existence part of (i) follows. We
postpone the proof of uniqueness in part (i) to the next section. The assertions (ii) and (iii) follow
directly from this construction. □



5 The sample-and-add operation 

Given a probability distribution $q(n:\cdot)$ on $[n]$, we now interpret Möhle's recursion (7) as the system
of equations for the invariant probability measure of a particular Markov transition mechanism on
partitions of $[n]$, and show that this invariant probability distribution is unique. This will complete
the proof of Theorem 9.

Consider the following sample-and-add random operation on $\mathcal{P}_{[n]}$, denoted $\mathrm{SA}_n$. We regard a
generic random partition $\Pi_n$ of $[n]$ as a random allocation of balls labeled $1, \ldots, n$ to some set of
nonempty boxes, which the operation $\mathrm{SA}_n$ transforms into some other random allocation $\Pi_n'$. Fix
$q(n:\cdot)$, a probability distribution on $[n]$, and let $K_n$ be a random variable with this distribution
$q(n:\cdot)$. Given $K_n = k$ and $\Pi_n = \pi_n$,

• if $k = 1$, first delete a single ball picked uniformly at random from the balls allocated according
to $\pi_n$, to make an intermediate partition of some set of $n-1$ balls, then add to this intermediate
partition a single box containing the deleted ball.
• if $k = 2, \ldots, n$, delete a sequence of $k-1$ of the $n$ balls from $\pi_n$ by uniform random sampling
without replacement, to obtain an intermediate partition of some set of $n-k+1$ balls, then
mark a ball picked uniformly from these $n-k+1$ balls, and add the $k-1$ sampled balls into
the box containing the marked ball.

In either case delete empty boxes in case any appear after the sampling step. The resulting partition
of $[n]$ is $\pi_n'$. For each $q(n:\cdot)$, this defines a Markovian transition operator $\mathrm{SA}_n$ on partitions of $[n]$.
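
The two rules above translate directly into a simulation step. The sketch below is an illustration under assumptions of our own: a partition is represented as a list of disjoint Python sets, `q_row[k-1]` stands for $q(n:k)$, and the name `sample_and_add` is hypothetical.

```python
import random


def sample_and_add(blocks, q_row, rng=random):
    """Apply one SA_n step to a partition given as a list of disjoint sets.

    q_row[k-1] plays the role of q(n:k); rng is the random module or a
    random.Random instance.
    """
    n = sum(len(b) for b in blocks)
    k = rng.choices(range(1, n + 1), weights=q_row)[0]  # K_n ~ q(n:.)
    balls = [x for b in blocks for x in b]

    if k == 1:
        # Delete one uniformly chosen ball and return it in a box of its own.
        moved = set(rng.sample(balls, 1))
        rest = [set(b) - moved for b in blocks]
        return [b for b in rest if b] + [moved]

    # Delete k - 1 balls by uniform sampling without replacement.
    moved = set(rng.sample(balls, k - 1))
    rest = [set(b) - moved for b in blocks]
    rest = [b for b in rest if b]  # drop boxes emptied by the sampling
    # Mark a uniformly chosen remaining ball; its box receives the sampled balls.
    marked = rng.choice([x for b in rest for x in b])
    for b in rest:
        if marked in b:
            b |= moved
            break
    return rest
```

Iterating this function from any starting partition gives a trajectory of the $\mathrm{SA}_n$ chain; the input list is not modified.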

Lemma 10. Let $\Pi_n$ be an exchangeable random partition of $[n]$ with finite EPPF $p$ defined as
a function of compositions of $m$ for $1 \le m \le n$. Let $\Pi_n'$ be derived from $\Pi_n$ by the $\mathrm{SA}_n$ operation
determined by some arbitrary probability distribution $q(n:\cdot)$ on $[n]$. Then $\Pi_n'$ is an exchangeable
random partition of $[n]$ whose EPPF $p'$ is determined on compositions of $n$ by the formula

\[ p'(n_1, n_2, \ldots, n_\ell) = \frac{q(n:1)}{n} \sum_{j:\, n_j = 1} p(\ldots, \widehat{n_j}, \ldots)
   + \sum_{k=2}^{n} q(n:k) \sum_{j:\, n_j \ge k} \frac{\binom{n_j}{k}}{\binom{n}{k}}\, p(\ldots, n_j - k + 1, \ldots). \tag{21} \]

Note. The right side of (21) is identical to the right side of Möhle's recursion (7).

Proof. Let $K_n$ with distribution $q(n:\cdot)$ be the number of balls deleted in the $\mathrm{SA}_n$ operation. For
each partition $\pi_n'$ of $[n]$ we can compute

\[ P(\Pi_n' = \pi_n') = \sum_{k=1}^{n} q(n:k)\, P(\Pi_n' = \pi_n' \mid K_n = k). \tag{22} \]

Assuming that $\pi_n'$ has boxes of sizes $n_1, \ldots, n_\ell$, and that the $\mathrm{SA}_n$ operation acts on an exchangeable
$\Pi_n$ with EPPF $p$, we deduce (21) from (22) and

\[ P(\Pi_n' = \pi_n' \mid K_n = 1) = \frac{1}{n} \sum_{j:\, n_j = 1} p(\ldots, \widehat{n_j}, \ldots), \tag{23} \]

\[ P(\Pi_n' = \pi_n' \mid K_n = k) = \sum_{j:\, n_j \ge k} \frac{\binom{n_j}{k}}{\binom{n}{k}}\, p(\ldots, n_j - k + 1, \ldots), \qquad k \ge 2. \tag{24} \]

Consider (24) first. For the event $(\Pi_n' = \pi_n')$ to occur there must be some $j$ with $n_j \ge k$. For each
such $j$, corresponding to a box of $\pi_n'$ with at least $k$ balls, the result $(\Pi_n' = \pi_n')$ might be obtained
by addition of $k-1$ balls to that box. The sequence of labels of these balls, in order of their choice,
can be any one of $n_j(n_j - 1) \cdots (n_j - k + 2)$ sequences, and the final ball chosen to mark the box
can be any one of $n_j - k + 1$ balls, making $k!\binom{n_j}{k}$ choices out of a total of $k!\binom{n}{k}$ possible choices.
Given one of these $k!\binom{n_j}{k}$ choices of $k$ balls, let $M_{k-1}$ be the set of labels of the $k-1$ balls that are
moved. Then the event $(\Pi_n' = \pi_n')$ occurs if and only if the restriction of $\Pi_n$ to $[n] - M_{k-1}$ equals
the restriction of $\pi_n'$ to $[n] - M_{k-1}$, which is a particular partition of $n-k+1$ labeled balls into
boxes of $\tilde{n}_1, \ldots, \tilde{n}_\ell$ balls, where $\tilde{n}_i = n_i 1(i \neq j) + (n_j - k + 1) 1(i = j)$. The conditional probability
of $(\Pi_n' = \pi_n')$, given $K_n = k$ and which of the $k!\binom{n_j}{k}$ possible choices of $k$ balls is made, is therefore
$p(\ldots, n_j - k + 1, \ldots)$, by the assumed exchangeability of $\Pi_n$, and the definition of the EPPF $p$ of
$\Pi_n$ on compositions of $m < n$ by restriction of $\Pi_n$ to subsets of size $m$. The evaluation (24) is now
apparent, and (23) too is apparent by a similar but easier argument.

□

Proposition 11. For each probability distribution $q(n:\cdot)$ on $[n]$, the corresponding $\mathrm{SA}_n$ transition
operator on partitions of $[n]$ has a unique stationary distribution. A random partition with
this stationary distribution is exchangeable, and its EPPF is the unique finite EPPF $p$ that satisfies
Möhle's recursion (7), that is (21) with $p' = p$.

Proof. If $q(n:1) = 1$ then eventually $\mathrm{SA}_n$ terminates with the singleton partition, so the stationary
distribution is degenerate and concentrated on the singleton partition. If $q(n:1) = 0$ then eventually
$\mathrm{SA}_n$ terminates with the one-block partition, so the stationary distribution is degenerate and
concentrated on the one-block partition. If $0 < q(n:1) < 1$ then also $q(n:k) > 0$ for some $k > 1$; in
this case the stationary law is again unique because all states communicate: e.g. the pure-singleton
partition $\Sigma_n$ is reachable from everywhere, and it can reach any partition in finitely many steps, as
is easily verified. Observe that passing to shapes projects the $\mathrm{SA}_n$ chain with state space partitions
of the set $[n]$ onto another Markov chain whose state space is the set of partitions of the integer
$n$. It follows easily that the unique stationary distribution of $\mathrm{SA}_n$ governs an exchangeable random
partition of $[n]$. The previous lemma shows that its EPPF $p$ solves Möhle's recursion. Finally, if an
EPPF $p$ solves Möhle's recursion, then it provides a stationary state for the $\mathrm{SA}_n$ chain. Hence the
uniqueness result for solutions of Möhle's recursion by an EPPF $p$. □



5.1 Special cases 

Following are two special cases of the $\mathrm{SA}_n$ operation:

Ewens' partition appears when $q(n:\cdot)$ may have only two positive entries

\[ q(n:1) = \frac{2\rho}{n - 1 + 2\rho} \quad \text{and} \quad q(n:2) = \frac{n-1}{n - 1 + 2\rho} \]

for each $n \ge 2$. It is easy to realize that the $\mathrm{SA}_n$ operation in this case is reduced to the following
operation with $u = 2\rho/(n - 1 + 2\rho)$: given a number $0 < u < 1$ and a partition of $[n]$ as allocation
of $n$ labeled balls, first we uniformly sample two balls named $A$ and $D$ without replacement from
the $n$ balls (so $A = D$ is excluded), then we put ball $A$ back to where it was, and finally

• with probability $u$ append a new box containing the single ball $D$,

• with probability $1 - u$ add the ball $D$ to the box containing ball $A$.

In this case, if we consider the FM operator determined by $q$, it is clear that only binary merges
happen. That the stationary partition $\Pi_n$ follows the Ewens' sampling formula with parameter
$\theta = (n-1)u/(1-u)$ is seen by the 'Chinese restaurant' rule |2H1 for transition from $\Pi_{n-1}$ to $\Pi_n$, or
can be easily concluded from the formula. The coincidence of the stationary distribution of this $\mathrm{SA}_n$
chain with the law of the induced final partition $\mathcal{E}_n$ of the associated $\mathrm{FM}_n$ chain confirms in this
case the well known fact that Kingman's coalescent with mutations terminates at Ewens' partition.
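
The claim that the Ewens sampling formula is stationary can be checked exactly for small $n$. The sketch below evaluates the right side of (21) with the two-entry row $q(n:\cdot)$ above and compares it with the Ewens EPPF $p(n_1, \ldots, n_\ell) = \theta^\ell \prod_i (n_i - 1)! / (\theta(\theta+1)\cdots(\theta+n-1))$, taking $\theta = 2\rho$. The helper names are hypothetical and the code is a numerical illustration, not a proof.

```python
from fractions import Fraction
from math import comb, factorial


def ewens_eppf(shape, theta):
    """Ewens EPPF of a composition `shape` (a tuple of block sizes)."""
    num = Fraction(theta) ** len(shape)
    for ni in shape:
        num *= factorial(ni - 1)
    den = Fraction(1)
    for i in range(sum(shape)):
        den *= theta + i
    return num / den


def mohle_rhs(shape, q_row, p):
    """Right side of (21) for a composition of n; q_row[k-1] stands for q(n:k)."""
    n = sum(shape)
    total = Fraction(0)
    for j, nj in enumerate(shape):
        if nj == 1:  # a singleton block is removed
            total += q_row[0] / n * p(shape[:j] + shape[j + 1:])
        for k in range(2, nj + 1):  # block j loses k - 1 elements
            reduced = shape[:j] + (nj - k + 1,) + shape[j + 1:]
            total += q_row[k - 1] * Fraction(comb(nj, k), comb(n, k)) * p(reduced)
    return total


def compositions(n):
    """Generate all compositions of n as tuples of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest
```

A typical use is to loop over `compositions(n)` and compare `mohle_rhs(shape, row, p)` with `p(shape)` for the Ewens row with a rational $\theta$.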

The $\mathrm{SA}_n$-chain resembles Moran's novel mutation chain |2f)l 1381 H?!) . Transitions of the latter are
the following: given a number $0 < u < 1$ and a partition of $[n]$ as allocation of labeled balls, first
choose two balls named $A$ and $D$ uniformly and independently from the $n$ balls (so $A = D$ is not
excluded), then follow the rules

• with probability $u$ append a new box with a single ball $C$,

• with probability $1 - u$ add a ball $C$ to the box that contains ball $A$,

then assign to ball $C$ the same label as that of $D$ and finally remove ball $D$. It is well known [55]
that the stationary law of Moran's chain corresponds to Ewens' partition with parameter $nu/(1 - u)$.



Hook partitions. Another extreme case appears when $q(n:\cdot)$ may have only two positive entries

\[ q(n:1) = \frac{n\rho}{1 + n\rho} \quad \text{and} \quad q(n:n) = \frac{1}{1 + n\rho}. \]

In this case $\mathrm{SA}_n$ creates some number of singletons and then after some number of steps puts all
balls in a single box. If $0 < q(n:1) < 1$, the stationary distribution concentrates on partitions with
a hook shape $(m, 1, 1, \ldots, 1)$. This partition results from the $\Lambda$-coalescent with freeze when $\Lambda = \delta_1$
is a Dirac mass at 1.
6 Infinite partitions

In this section we pass from finite partitions to the projective limit, and arrive at the desired
integral representation of the infinite decrement matrix $q_\infty$ satisfying recursion (13), (14). This gives
another approach to Möhle's partitions via consistent freeze-and-merge chains, which may be seen
as discrete-time jump processes associated with the $\Lambda$-coalescent with freeze.

An infinite sequence of freeze-and-merge operations $\mathrm{FM} := (\mathrm{FM}_n, n = 1, 2, \ldots)$ which satisfies
the condition in Definition 5 for all positive integers $1 \le m < n < \infty$ is called consistent. By
Lemma 7 such a sequence FM is determined by an infinite decrement matrix $q_\infty$ which satisfies the
recursions (13), (14).

For each $n = 1, 2, \ldots$ the Markov chain starting from $\Sigma_n^*$ and driven by $\mathrm{FM}_n$ terminates with
an induced final partition $\Pi_n$. These comprise an infinite partition $\Pi_\infty = (\Pi_n)_{n=1}^{\infty}$ which we call
the final partition associated with consistent FM. In the case $q(2:1) = 0$ the final partition is the
trivial one-block partition.

Lemma 12. For every infinite decrement matrix $q_\infty$ with entries satisfying the recursion (13),
(14) there exist a non-negative finite measure $\Lambda$ on $[0, 1]$ and a non-negative real number $\rho$, which
satisfy $(\Lambda, \rho) \neq (0, 0)$ and are such that the representation $q(n:k) = \Phi(n:k)/\Phi(n)$ $(1 \le k \le n)$
holds with $\Phi$ as in (9), (10), (11). The data $(\Lambda, \rho)$ are unique up to a positive factor.
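
The converse direction, that a pair $(\Lambda, \rho)$ produces a matrix satisfying (13), (14), can be verified exactly when $\Lambda$ is a finite combination of point masses. The sketch below rests on an assumption that should be checked against (9), (10), (11): it takes $\Phi(n:1) = \rho n$, $\Phi(n:k) = \binom{n}{k} \int x^{k-2}(1-x)^{n-k} \Lambda(dx)$ for $k \ge 2$, and $\Phi(n) = \sum_k \Phi(n:k)$, a form consistent with the Ewens and hook examples of Section 5.1. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def decrement_rows(atoms, rho, n_max):
    """Rows q(n:.) for n = 1..n_max, with Lambda = sum of w * delta_x.

    atoms is a list of (x, w) pairs with 0 <= x <= 1 and w > 0; rho > 0.
    ASSUMPTION: Phi(n:1) = rho * n and, for k >= 2,
    Phi(n:k) = C(n,k) * integral of x^(k-2) (1-x)^(n-k) against Lambda.
    """
    rows = {}
    for n in range(1, n_max + 1):
        phi = [Fraction(rho) * n]
        for k in range(2, n + 1):
            integral = sum(w * x ** (k - 2) * (1 - x) ** (n - k) for x, w in atoms)
            phi.append(comb(n, k) * integral)
        total = sum(phi)
        rows[n] = [v / total for v in phi]
    return rows


def satisfies_backward_recursion(rows):
    """Check (13) and (14) exactly for every pair of consecutive rows."""
    for b in range(1, max(rows)):
        up, cur = rows[b + 1], rows[b]
        corr = (up[0] + 2 * up[1]) / (b + 1)
        if cur[0] != Fraction(b, b + 1) * up[0] + corr * cur[0]:  # (14)
            return False
        for k in range(2, b + 1):  # (13)
            rhs = (Fraction(k + 1, b + 1) * up[k]
                   + Fraction(b + 1 - k, b + 1) * up[k - 1]
                   + corr * cur[k - 1])
            if cur[k - 1] != rhs:
                return False
    return True
```

Passing the atoms and $\rho$ as `Fraction` values keeps every comparison exact.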

Proof. Suppose $q$ solves (13), (14) and suppose $q(2:2) < 1$. Let $\Phi(n)$, $n = 1, 2, \ldots$ satisfy

\[ \frac{\Phi(n)}{\Phi(n+1)} = 1 - \frac{q(n+1:1)}{n+1} - \frac{2\, q(n+1:2)}{n+1} \tag{25} \]

for $n \ge 1$; because the right side is strictly positive this recursion has a unique solution with some
given initial value $\Phi(1) = \rho$, where $\rho > 0$. For $2 \le k \le n$ set

\[ \Phi(n:k) := q(n:k)\, \Phi(n), \]

then from (25) and (13)

\[ \Phi(n:k) = \frac{k+1}{n+1}\, \Phi(n+1:k+1) + \frac{n-k+1}{n+1}\, \Phi(n+1:k) \qquad (2 \le k \le n < \infty). \]

Apart from a shift by 2, this is the well-known Pascal-triangle recursion appearing in connection with
de Finetti's theorem and the Hausdorff moment problem, hence (10) holds for some non-negative
measure $\Lambda$ on Borel sets of $[0, 1]$. From (14) we find

\[ \rho = \frac{\Phi(1)\, q(1:1)}{1} = \cdots = \frac{\Phi(n)\, q(n:1)}{n} = \cdots, \]

and from

\[ \sum_{k=1}^{n} \Phi(n)\, q(n:k) = \Phi(n) \]

we deduce (11) and $q(n:1) = \rho n/\Phi(n)$. Setting by definition $\Phi(n:1) := \rho n$ we are done. For the
special case $q(2:2) = 1$, it is easy to observe that $\rho = 0$, and we get $\Lambda = \delta_0$ by similar analysis. □



Recording this lemma together with previous results, we have:

Theorem 13. Let $(\Pi_n)_{n=1}^{\infty}$ be a nontrivial exchangeable random partition of $\mathbb{N}$, different from
the trivial one-block partition. The following are equivalent:

(i) The EPPF $p$ satisfies recursion (7) with some infinite decrement matrix $q_\infty$.

(ii) This matrix is representable as $q(n:k) = \Phi(n:k)/\Phi(n)$ with $\Phi$ defined by (9), (10), (11) and
some nontrivial $(\Lambda, \rho)$, which is unique up to a positive factor.

(iii) This $\Pi_\infty$ is induced by the final partition of some standard $\Lambda$-coalescent freezing at rate $\rho$.

(iv) This $\Pi_\infty$ is the final partition of some consistent FM operation.

Complementing this result, we have the following uniqueness assertion. 

Lemma 14. The correspondence $q \mapsto p$ between infinite decrement matrices with $q(2:1) > 0$
satisfying consistency (13), (14) and the EPPF's is bijective.

Proof. We only need to show that $p$, which by Lemma 8 must solve (7), uniquely determines $q$. For
general infinite partitions $q(2:1) = p(1, 1) > 0$ implies that $p(1, 1, \ldots, 1) > 0$. This applied to the
singleton shapes together with

\[ p(1, \ldots, 1) = q(n:1)\, q(n-1:1) \cdots q(2:1) \]

shows that the $q(n:1)$'s are uniquely determined by $p$. To show that $q(n:m)$ for $1 < m \le n-1$ is
also determined by $p$, exploit the formula

\[ p(m, 1, \ldots, 1) = \frac{q(n:m)}{\binom{n}{m}}\, p(1, 1, \ldots, 1)
   + \sum_{k=2}^{m-1} q(n:k)\, \frac{\binom{m}{k}}{\binom{n}{k}}\, p(m-k+1, 1, \ldots, 1)
   + q(n:1)\, \frac{n-m}{n}\, p(m, 1, \ldots, 1), \]

and argue by induction in $m = 2, 3, \ldots, n-1$. □



Thus if an exchangeable infinite partition can be realized as the induced final partition of a
consistent FM-operation, then this FM-operation is unique. The realization via a $(\Lambda, \rho)$-coalescent
process is unique up to a positive multiple of the parameters, which corresponds to a linear
time-change of the coalescent. If there is no freeze the uniqueness fails, since any $\Lambda$-coalescent terminates
with the trivial one-block partition.

We classify next the cases when some of the entries of $q$ are zeros. It is assumed that the starting
partition is $\Sigma_\infty^*$.

(i) If $q(n:1) = 1$ holds for $n = 2$ then the same holds for $n > 2$. This is the pure-freeze coalescent
with $\Lambda = 0$, hence $\mathcal{E}_\infty = \Sigma_\infty$.

(ii) If $q(n:1) = 0$ holds for $n = 2$ then the same holds for $n > 2$. This is a $\Lambda$-coalescent with no
freeze, hence $\mathcal{E}_\infty$ is the one-block partition.

(iii) If $q(n:1) > 0$, $q(n:2) > 0$ and $q(n:1) + q(n:2) = 1$ hold for $n = 3$ then the same relations
hold for $n > 3$. This is the case of Kingman's coalescent with freeze, $\Lambda$ is a positive mass at
0, and $\mathcal{E}_\infty$ is Ewens' partition.

(iv) If $q(n:1) > 0$, $q(n:n) > 0$ and $q(n:1) + q(n:n) = 1$ hold for $n = 3$ then also for $n > 3$. In
this case $\Lambda$ is a positive mass at 1, and $\mathcal{E}_\infty$ is a hook partition.

The 'generic' case is characterised by $q(3:1) > 0$, $q(3:2) > 0$, $q(3:3) > 0$, in which case
$0 < q(n:m) < 1$ for all $1 \le m \le n < \infty$.



7 Positivity 

This section provides a construction of decrement matrices $q_\infty$ satisfying the consistency condition
(13), (14), from a single sequence of real numbers satisfying a positivity condition. For $(c(n), n =
0, 1, 2, \ldots)$ a sequence of real numbers, the backward difference operator $\nabla$ is defined as

\[ \nabla c(n) := c(n) - c(n+1), \]

and for any $j = 0, 1, 2, \ldots$ its iterates act as

\[ \nabla^j c(n) = \sum_{i=0}^{j} (-1)^i \binom{j}{i}\, c(n+i). \]

Now let $(\Phi(n), n = 1, 2, \ldots)$ be a sequence of real numbers and $\rho$ be a positive real number.

Define for each $n$

\[ \Phi(n:1) := \rho n, \tag{26} \]



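and

\[ \widehat{\Phi}(n) := \Phi(n) - \rho n. \tag{27} \]

Define

\[ \Psi(n) := \frac{\widehat{\Phi}(n+1) - \widehat{\Phi}(n)}{n} \tag{28} \]

and let

\[ \Phi(n:m) := \binom{n}{m}\, \nabla^{m-2} \Psi(n-m+1), \qquad 2 \le m \le n. \tag{29} \]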
With these definitions, it can be verified that for each $n$

\[ \Phi(n) = \Phi(n:1) + \Phi(n:2) + \cdots + \Phi(n:n). \tag{30} \]

Hence if all $\Phi(n)$ are positive and all $\Phi(n:m)$ are non-negative, the matrix with entries

\[ q(n:m) = \frac{\Phi(n:m)}{\Phi(n)}, \qquad 1 \le m \le n \tag{31} \]

is a well defined infinite decrement matrix. More than that, we have the following observation:

Lemma 15. Suppose that a sequence of positive real numbers $\rho$, $\Phi(n)$, $n = 1, 2, \ldots$ is such that
each entry $\Phi(n:1)$, $\Phi(n:m)$ in (26), (29) is non-negative. Then the matrix (31) satisfies the
recursion (13), (14).

Proof. The definition (29) of $\Phi(n:m)$ implies the recursion

\[ \Phi(n:m) = \frac{m+1}{n+1}\, \Phi(n+1:m+1) + \frac{n-m+1}{n+1}\, \Phi(n+1:m), \qquad 2 \le m \le n. \tag{32} \]

Using this relation, the first recursion (13) can be reduced to

\[ 2\,\Phi(n+1:2) = (n+1)\bigl(\Phi(n+1) - \Phi(n)\bigr) - \Phi(n+1:1), \]

which follows from the definition of $\Phi(n+1:2)$ and $\Phi(n+1:1)$. The second recursion is actually the
definition of $\Phi(n+1:2)$ after we plug in all the $\Phi(n:1)$, $\Phi(n+1:1)$ terms. □



The above lemma shows that given a sequence of positive real numbers with some additional
positivity property, we can recover Möhle's partition structure by first defining a consistent
decrement matrix, then using the recursion (7). By Lemma 12 we know that every decrement matrix
satisfying the consistency condition (13), (14) has an integral representation which is unique up to a
positive factor, so it is clear that we also have an integral representation for the sequence of $\Phi(n)$ given
here:

Proposition 16. A sequence of positive real numbers $\rho$, $\Phi(n)$, $n = 1, 2, \ldots$ is such that each
entry $\Phi(n:1)$, $\Phi(n:m)$ as in (26), (29) is non-negative if and only if these numbers admit the
integral representation (9), (10), (11) for some non-negative finite measure $\Lambda$ on $[0, 1]$, which is then
unique.






8 Freezing times 



In this section $(\Pi^*(t), t \ge 0)$ is a standard $(\Lambda, \rho)$-coalescent, with $(\Pi(t), t \ge 0)$ the induced ordinary
partitions, and $\mathcal{E}_\infty$ the final partition. We assume that both $\Lambda$ and $\rho$ are nonzero. The process $(\Pi^0(t), t \ge 0)$
will denote the standard $\Lambda$-coalescent. We presume that all $(\Lambda, \rho)$-coalescents are defined
consistently as $\rho$ varies, so that the $\Pi(t)$'s and $\mathcal{E}_\infty$ get finer as the freezing rate $\rho$ increases, in particular
each partition $\Pi(t)$ is finer than $\Pi^0(t)$, for each $t \ge 0$ and $\rho > 0$.

8.1 Age ordering 

Assigning each individual $j \in \mathbb{N}$ the freezing time $\tau_j$, when the active block containing $j$ gets frozen,
the final partition $\mathcal{E}_\infty$ is defined by sending $i, j$ to the same block if and only if $\tau_i = \tau_j$. The
correspondence $j \mapsto \tau_j$ induces a total order on the set of blocks of $\mathcal{E}_\infty$: we say that the block
containing $j$ is older than the block containing $i$ if $\tau_i < \tau_j$. With this age ordering, $\mathcal{E}_\infty$ is an ordered
exchangeable partition of $\mathbb{N}$, as studied in [71 .

We preserve the notation $\mathcal{E}_\infty = (\mathcal{E}_n)$ to denote the partition with this additional feature of total
order on the set of the blocks. The law of the ordered partition $\mathcal{E}_\infty$ is determined by an exchangeable
composition probability function (ECPF) $c(n_1, \ldots, n_\ell)$ on compositions of $n$. The ECPF $c$ must
satisfy an addition rule similar to the one satisfied by $p$ but, unlike $p$, need not be symmetric. The EPPF $p$ of the unordered
partition is recovered from $c$ by symmetrization. See [14] for details.

With each $j$ we associate a random open interval $]a_j, b_j[$, where

\[ a_j = \lim_{n \to \infty} \#\{i \le n : \tau_i < \tau_j\}/n, \qquad b_j - a_j = \lim_{n \to \infty} \#\{i \le n : \tau_i = \tau_j\}/n, \tag{33} \]

and the existence of the frequencies is guaranteed by de Finetti's theorem. Thus $a_j$ is the total
frequency of blocks preceding the block containing $j$, and $b_j - a_j$ is the frequency of the block
containing $j$. The random open set $U = \bigcup_j\, ]a_j, b_j[$ is the paintbox representing $\mathcal{E}_\infty$. The partition
$\mathcal{E}_\infty$ can be uniquely recovered from $U$ by a simple sampling scheme [191 121)1 114) .

For instance, when $\Lambda = \delta_0$, the complement closed set is $U^c = \{1, Y_1, Y_1 Y_2, \ldots, 0\}$ for $Y_k$'s
independent random variables whose distribution is beta$(2\rho, 1)$. This case has been thoroughly
studied 0|H|, and it is well known that the arrangement of the block sizes in the age order is inverse
to the arrangement in size-biased order. In the case $\Lambda = \delta_1$, the set $U$ has only one interval $]Y, 1[$,
where $Y$ has a beta distribution.

8.2 Properties of the final partition 

Some properties of $U$ for a $(\Lambda, \rho)$-coalescent with $\rho > 0$ follow from known results about the
$\Lambda$-coalescents [HJ-. We shall discuss only the case $\Lambda\{1\} = 0$, since the case $\Lambda\{1\} > 0$ only differs by an
independent exponential killing and its properties easily follow from those in the case $\Lambda\{1\} = 0$. Let

\[ \mu_r := \int_0^1 x^r \Lambda(dx). \]

Denote by Leb the Lebesgue measure on $[0, 1]$. In the event $\mathrm{Leb}(U) < 1$ the ordered partition $\mathcal{E}_\infty$ with
paintbox $U$ has a positive total frequency of singleton blocks, and in the event $\mathrm{Leb}(U) = 1$ there
are no singleton blocks at all.

Proposition 17. If $\mu_{-1} < \infty$ then with probability one

(i) $\Pi^0(t)$ has singletons, for each $t > 0$,

(ii) $\Pi^*(t)$ has active singletons, for each $t > 0$,

(iii) $\Pi^*(t)$ has frozen singletons, for each $t > 0$,

(iv) $\mathcal{E}_\infty$ has singleton blocks.
If $\mu_{-1} = \infty$ then the opposites of (i)-(iv) hold with probability one.

Proof. By [31, Lemma 25], if $\mu_{-1} < \infty$ then $\Pi^0(t)$ has singletons almost surely, and if $\mu_{-1} = \infty$ the
partition has no singletons almost surely. Now, if $\Pi^0(t)$ has singletons each of them is active with
probability $0 < e^{-\rho t} < 1$, independently of the others, thus the partially frozen partition $\Pi^*(t)$ has
singletons in both conditions, and the frozen ones are also singleton blocks of $\mathcal{E}_\infty$. Conversely, if with
positive probability $\mathcal{E}_\infty$ has singletons then for some $t$ with positive probability $\Pi^*(t)$ has frozen
singletons, then, perhaps for some other $t$, with positive probability $\Pi^*(t)$ has active singletons, but
in this event the partition $\Pi^0(t)$ has singletons, hence $\mu_{-1} = \infty$ cannot hold. □

By [31, Proposition 23] the $\Lambda$-coalescent either comes down from infinity (the number of blocks
in $\Pi^0(t)$ is finite almost surely for every $t > 0$) or stays infinite (the number of blocks is infinite
almost surely for every $t > 0$).

Proposition 18. If the $\Lambda$-coalescent stays infinite, then the $(\Lambda, \rho)$-coalescent has infinitely many
active blocks at any time, therefore

(i) the set of freezing times $\{\tau_j\}$ is dense in $\mathbb{R}_+$,

(ii) the closed set $U^c$ has empty interior and no isolated points.

If the $\Lambda$-coalescent comes down from infinity, then the $(\Lambda, \rho)$-coalescent satisfies
(i') the set of freezing times $\{\tau_j\}$ is bounded and only accumulates near 0,
(ii') the closed set $U^c$ only accumulates near 0.

Proof. Let $J_k$ be the minimal element in some block $A_k$ of $\Pi^0(t)$. Then $J_k$ is also the minimal
element in some block $B_k \subset A_k$ of $\Pi^*(t)$. Since the block containing $J_k$ changes the condition from
active to frozen independently of the $\Lambda$-coalescent, with positive probability $1 - e^{-\rho t}$ the block $B_k$ is
active. For $k = 1, 2, \ldots$ these events are independent, hence $\Pi^*(t)$ has infinitely many active blocks.
But the same is true for $t + \epsilon$, hence arguing as in Proposition 17 we see that infinitely many of
the active $B_k$'s get frozen before $t + \epsilon$, whence (i). Moreover, infinitely many of the active $B_k$'s are
nonsingleton, hence, by the law of large numbers for exchangeable trials, have positive frequency.
The assertion (ii) follows now from this remark, (i) and (33). □

9 Comparison with regenerative partitions

This section is devoted to parallels and differences between Möhle's partitions and regenerative
partitions [14, 15]. A novel feature discussed here is a realization of regenerative partitions by a
simple continuous-time coalescent process.

9.1 Continuous time realization and EPPF 

Consider a $\mathcal{P}_\infty^*$-valued Markovian process $(\Pi^*(t), t \ge 0)$ which starts with $\Sigma_\infty^*$ and evolves by
the following rules. Any number of active singleton blocks can merge to form a single frozen block,
which suspends further evolution immediately. In particular, an active singleton block can turn into
a frozen singleton block, an event interpreted as a unary merge. If $\Pi_n^*(t)$ has $b$ active blocks, each
$k$-tuple is merging at the same rate, so that the total rate for a $k$-merge is $\Phi(b:k)$, for $1 \le k \le b < \infty$,
and $\Phi(1:1) > 0$.

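Eventually there are only frozen blocks whose configuration determines a final partition $\mathcal{E}_\infty$.

Setting $\Phi(b) := \Phi(b:1) + \cdots + \Phi(b:b)$ and $q(n:k) := \Phi(n:k)/\Phi(n)$, the EPPF of $\mathcal{E}_\infty$ satisfies

\[ p(n_1, \ldots, n_\ell) = \sum_{j=1}^{\ell} \frac{q(n:n_j)}{\binom{n}{n_j}}\, p(\ldots, \widehat{n_j}, \ldots) \tag{34} \]

for any composition $(n_1, n_2, \ldots, n_\ell)$ of $n$, which is a recursion analogous to (7). This allows an
explicit formula

\[ p(n_1, n_2, \ldots, n_\ell) = \binom{n}{n_1, \ldots, n_\ell}^{-1} \sum_{\sigma} \prod_{j=1}^{\ell} q\bigl(N_\sigma(j) : n_{\sigma(j)}\bigr) \tag{35} \]

where the sum is over all permutations $\sigma : [\ell] \to [\ell]$, and $N_\sigma(j) = n_{\sigma(j)} + \cdots + n_{\sigma(\ell)}$.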
9.2 Subordinator 

Exchangeability implies the existence of a nonnegative finite measure $\Lambda$ on $[0, 1]$ such that

\[ \Phi(b:k) = \binom{b}{k} \int_0^1 x^{k-1} (1-x)^{b-k}\, \Lambda(dx), \tag{36} \]

a representation to be compared with (10). The cumulative rate for some transition when $\Pi_n^*(t)$ has
$b$ active blocks equals

\[ \Phi(b) := \Phi(b:1) + \cdots + \Phi(b:b) = \int_0^1 \frac{1 - (1-x)^b}{x}\, \Lambda(dx). \]

The last formula is an integral representation of a Bernstein function, hence the measure $\Lambda(dx)/x$
can be associated with some subordinator. Explicitly, by de Finetti's theorem there exists the
limit proportion $S_t$ of integers in $[n]$ that comprise the frozen blocks of $\Pi_n^*(t)$, as $n \to \infty$. The
process $(-\log(1 - S_t), t \ge 0)$ is a subordinator with $S_0 = 0$ and distribution determined by

\[ E[(1 - S_t)^\lambda] = e^{-t\Phi(\lambda)}, \qquad t \ge 0,\ \lambda \ge 0, \]

which is a version of the Lévy-Khintchine formula in the form of the Mellin transform. The
subordinator has a drift if $\Lambda$ has an atom at 0.

Putting the blocks of $\mathcal{E}_\infty$ in increasing order of their freezing times yields an ordered exchangeable
partition with ECPF

\[ c(n_1, \ldots, n_\ell) = \prod_{j=1}^{\ell} \frac{q(N_j : n_j)}{\binom{N_j}{n_j}}, \]

where $N_j := n_j + \cdots + n_\ell$. The closed range of the process $(S_t)$ is the complement $U^c$ to the paintbox
$U$ of the ordered partition $\mathcal{E}_\infty$.

9.3 Related Markov chains 
9.3.1 Transient 

For regenerative partitions the analogue of $\mathrm{FM}_n$ introduced in Section 4 is the following. Let
$q_\infty = \{q(b:k), 1 \le k \le b < \infty\}$ be a decrement matrix. If there are $b$ active blocks in a partially
frozen partition of $[n]$, then with probability $q(b:k)$ any $k$ of $b$ active blocks are chosen uniformly
at random and merged into a single frozen block.
Consistency translates as the recursion

\[ q(b:k) = \frac{k+1}{b+1}\, q(b+1:k+1) + \frac{b+1-k}{b+1}\, q(b+1:k) + \frac{1}{b+1}\, q(b+1:1)\, q(b:k) \tag{37} \]

with $q(1:1) = 1$, which leads to

\[ q(b:k) = \Phi(b:k)/\Phi(b) \qquad (1 \le k \le b < \infty), \]

where $\Phi$ has the above integral representation (36) with some measure $\Lambda$ unique up to a positive
multiple.
9.3.2 Recurrent



The analogue of the operation $\mathrm{SA}_n$ introduced in Section [SJ, acting on ordinary partitions of $[n]$, is the
following operation $\mathrm{TS}_n$. Given a decrement matrix $q$, let $K_n$ follow $q(n:\cdot)$. Choose a value $k$ for $K_n$, then
starting from some partition $\pi_n$ of $[n]$ sample $k$ balls from $\pi_n$ uniformly without replacement, and
then append a new box with these $k$ balls to the remaining partition of $n - k$ balls. According to an
ordered version of the algorithm, acting on ordered partitions, the balls are sampled from a totally
ordered series of boxes, and the newly created box is always arranged as the first box in the series.
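
A minimal sketch of one step of the ordered version of this operation is given below. It is an illustration under stated assumptions, not the authors' implementation: a partition is represented as a list of blocks (lists of integers), the row $q(n:\cdot)$ is passed as a list of weights, and the function name `ts_step` is invented for the example.

```python
import random


def ts_step(partition, q_row, rng=random):
    """Apply one sample-and-append step to an ordered partition of {1, ..., n}.

    partition -- list of non-empty blocks (lists of ints) covering {1, ..., n}
    q_row     -- weights q(n:1), ..., q(n:n) for the number K_n of sampled balls
    rng       -- the random module or a random.Random instance
    """
    n = sum(len(block) for block in partition)
    # Draw the number k of balls to sample according to q(n : .).
    k = rng.choices(range(1, n + 1), weights=q_row)[0]
    # Sample k balls uniformly without replacement.
    balls = [ball for block in partition for ball in block]
    sampled = set(rng.sample(balls, k))
    # Remove the sampled balls and discard boxes that became empty.
    remaining = [[ball for ball in block if ball not in sampled] for block in partition]
    remaining = [block for block in remaining if block]
    # The newly created box is placed first in the series.
    return [sorted(sampled)] + remaining


if __name__ == "__main__":
    rng = random.Random(1)
    state = [[1, 2, 3], [4, 5]]
    for _ in range(3):
        state = ts_step(state, [0.2] * 5, rng)
        print(state)
```

Dropping the order of the returned list gives the unordered version of the step. The uniform row `[0.2] * 5` in the usage lines is an arbitrary choice for demonstration and is not claimed to come from a regenerative decrement matrix.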

In contrast to the $\mathrm{SA}_n$ operation, these Markov chains on partitions of $[n]$ are consistent under
restrictions as $n$ varies. To see that the operations $\mathrm{SA}_n$ are not consistent as $n$ varies (excluding the
hook case $q(n:1) + q(n:n) = 1$), fix $n > 2$ and let $\pi_{n+1}$ be a partition having a singleton block
$\{n+1\}$. There is a chance that some $2 \le r \le n$ balls are sampled from $\pi_{n+1}$ and added to the box
$\{n+1\}$. In this case the restriction of $\mathrm{SA}_{n+1}$ to $[n]$ creates a new nonsingleton box, which is not
a legitimate option for $\mathrm{SA}_n$.

It has been shown that the unique stationary $[n]$-partition is the one given by (35).

Example. When 

$$q(n:1) = \frac{n\rho}{1+n\rho}, \qquad q(n:n) = \frac{1}{1+n\rho},$$

the operation will create a new singleton block with probability $q(n:1)$, and merge everything in one
block with probability $q(n:n)$. So the stationary distributions will concentrate on hook partitions.
The decrement matrix for this chain is the same as for $\mathrm{SA}_n$.
Example. When 

$$q(n:m) = \binom{n}{m} \frac{[\theta]_{n-m}\, m!}{[\theta+1]_{n-1}\, n}, \qquad (38)$$

with $\theta = 2\rho$, the invariant partition is Ewens' with parameter $\theta$. The decrement matrix for this
chain is different from the one for $\mathrm{SA}_n$, which also leads to Ewens' distribution.
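
Both example matrices can be tabulated and sanity-checked in exact arithmetic. The sketch below assumes the hook matrix is $q(n:1) = n\rho/(1+n\rho)$, $q(n:n) = 1/(1+n\rho)$, that (38) reads $q(n:m) = \binom{n}{m}[\theta]_{n-m}\,m!/([\theta+1]_{n-1}\,n)$ with $[x]_k$ the rising factorial, and that (37) has the three-term form coded in `satisfies_37`. All function names are invented for the example.

```python
from fractions import Fraction
from math import comb, factorial


def rising(x, k):
    """Rising factorial [x]_k = x (x+1) ... (x+k-1), with [x]_0 = 1."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out


def q_hook(n, m, rho):
    """Hook example: mass n*rho/(1+n*rho) on m = 1 and 1/(1+n*rho) on m = n."""
    numerator = (n * rho if m == 1 else 0) + (1 if m == n else 0)
    return Fraction(numerator) / (1 + n * rho)


def q_ewens(n, m, theta):
    """Ewens example, formula (38) as reconstructed in the lead-in."""
    return comb(n, m) * rising(theta, n - m) * factorial(m) / (rising(theta + 1, n - 1) * n)


def satisfies_37(q, b_max):
    """Check recursion (37) for all 1 <= k <= b <= b_max, where q(n, m) is a callable."""
    return all(
        q(b, k) == Fraction(k + 1, b + 1) * q(b + 1, k + 1)
        + Fraction(b + 1 - k, b + 1) * q(b + 1, k)
        + Fraction(1, b + 1) * q(b + 1, 1) * q(b, k)
        for b in range(1, b_max + 1)
        for k in range(1, b + 1)
    )


if __name__ == "__main__":
    rho = Fraction(3, 2)
    theta = 2 * rho
    print([str(q_ewens(4, m, theta)) for m in range(1, 5)])
    print(satisfies_37(lambda n, m: q_hook(n, m, rho), 6))
    print(satisfies_37(lambda n, m: q_ewens(n, m, theta), 6))
```

The value $\rho = 3/2$ is an arbitrary test input. Passing `Fraction` parameters keeps every entry rational, so row sums and the recursion can be compared exactly.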

9.4 Comparing decrement matrices 

In [14] we found very similar recursions for the entries of the decrement matrix which characterizes a
regenerative composition structure, hence a regenerative partition structure in [15]. According to [14,
Proposition 3.3], a non-negative matrix $q$ is the decrement matrix of some regenerative composition
structure if and only if $q(1:1) = 1$ and (37) holds for $1 \le k \le b$. Comparing with Lemma 7 above,
the difference from our recursions here is that we have a separate recursion for $q(b:1)$, and we have
an extra term

$$\frac{2}{b+1}\,q(b+1:2)\,q(b:k)$$

on the right-hand side of the recursions for $q(b:k)$, $k \ge 2$. Both of them are backward recursions. For the
purpose of illustration, suppose we are given $q(4:k)$, $k = 1, 2, 3, 4$. The entries $q(b:\cdot)$ with $b \le 3$ of the
decrement matrix for a regenerative composition structure would then be:



$$q(3:3) = \frac{4q(4:4) + q(4:3)}{4 - q(4:1)}, \qquad q(3:2) = \frac{3q(4:3) + 2q(4:2)}{4 - q(4:1)}, \qquad q(3:1) = \frac{2q(4:2) + 3q(4:1)}{4 - q(4:1)},$$

$$q(2:2) = \frac{3q(3:3) + q(3:2)}{3 - q(3:1)} = \frac{6q(4:4) + 3q(4:3) + q(4:2)}{6 - 3q(4:1) - q(4:2)},$$

$$q(2:1) = \frac{2q(3:2) + 2q(3:1)}{3 - q(3:1)} = \frac{3q(4:3) + 4q(4:2) + 3q(4:1)}{6 - 3q(4:1) - q(4:2)}.$$

For the decrement matrix of the partition structure studied here, by contrast, we have

$$q(3:3) = \frac{4q(4:4) + q(4:3)}{4 - q(4:1) - 2q(4:2)}, \qquad q(3:2) = \frac{3q(4:3) + 2q(4:2)}{4 - q(4:1) - 2q(4:2)}, \qquad q(3:1) = \frac{3q(4:1)}{4 - q(4:1) - 2q(4:2)},$$

$$q(2:2) = \frac{3q(3:3) + q(3:2)}{3 - q(3:1) - 2q(3:2)} = \frac{6q(4:4) + 3q(4:3) + q(4:2)}{6 - 3q(4:1) - 5q(4:2) - 3q(4:3)},$$

$$q(2:1) = \frac{2q(3:1)}{3 - q(3:1) - 2q(3:2)} = \frac{3q(4:1)}{6 - 3q(4:1) - 5q(4:2) - 3q(4:3)}.$$
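
The two backward recursions can be compared side by side with a short script. The sketch below is an illustration, not the authors' code: it solves each recursion for row $b$ given row $b+1$, using the regenerative recursion (37) when `freeze=False` and, when `freeze=True`, the variant with the extra term $\frac{2}{b+1}q(b+1:2)q(b:k)$ and a separate rule for $q(b:1)$. The rule for $q(b:1)$ is inferred here from the displayed formulas for $q(3:1)$ and $q(2:1)$ and should be treated as an assumption, as should the function name `step_down`.

```python
from fractions import Fraction


def step_down(row, freeze=False):
    """Given row = [q(b+1:1), ..., q(b+1:b+1)], return [q(b:1), ..., q(b:b)].

    freeze=False: regenerative composition structure, recursion (37).
    freeze=True : recursion with the extra q(b+1:2) term and a separate rule
                  for q(b:1) (inferred from the displayed examples).
    """
    b = len(row) - 1
    q1, q2 = row[0], row[1]
    # Collecting the q(b:k) terms on the left gives a common denominator.
    denom = (b + 1) - q1 - (2 * q2 if freeze else 0)
    out = []
    for k in range(1, b + 1):
        if freeze and k == 1:
            numer = b * q1
        else:
            # (k+1) q(b+1:k+1) + (b+1-k) q(b+1:k); row is 0-indexed.
            numer = (k + 1) * row[k] + (b + 1 - k) * row[k - 1]
        out.append(numer / denom)
    return out


if __name__ == "__main__":
    row4 = [Fraction(1, 4)] * 4  # hypothetical input row q(4:1), ..., q(4:4)
    for freeze in (False, True):
        row3 = step_down(row4, freeze)
        row2 = step_down(row3, freeze)
        print(freeze, [str(x) for x in row3], [str(x) for x in row2])
```

For the uniform input row used in the usage lines, the regenerative recursion returns uniform rows again, while the variant with the extra term puts less mass on $k = 1$. In both cases each computed row sums to 1.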

10 Comparison with Markovian fragmentations 

The theory of homogeneous and self-similar Markovian fragmentation processes due to Bertoin is
formulated much like the present theory of coalescents, in terms of consistent partition-valued
processes. Ford [11, Proposition 41] provides a sampling consistency condition for decrement matrices
associated with discrete fragmentation processes which is an extremely close relative of our own
consistency lemma. The article [18] provides an integral representation for such decrement matrices, analogous to
our results for the decrement matrices associated with regenerative partition structures and with
Markovian coalescents, and embeds Ford's result in the broader context of continuous-time
fragmentation processes and continuum random trees. A missing element of the fragmentation discussion
is some way of deriving a partition structure by a recursion like (12). But we expect that such a
partition structure and an associated recursion may be associated with a suitably defined Markovian
fragmentation with freeze, such as that introduced in [12].